@noahho
Collaborator

@noahho noahho commented Oct 28, 2025

Issue

The notebooks errored because the OpenML download failed. This PR fixes that by loading the dataset from UCI instead.

@noahho noahho requested a review from a team as a code owner October 28, 2025 18:57
@noahho noahho requested review from brendan-priorlabs and Copilot and removed request for a team October 28, 2025 18:57
Contributor

Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.


@gemini-code-assist
Contributor

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

Contributor

@brendan-priorlabs brendan-priorlabs left a comment

LGTM. Let me know if you have any tips on how to review notebooks. I checked the code out locally, but even then the diff was so big I wasn't really sure what to focus on. The following bit stood out though and seems reasonable, so approving on account of that!

       "source": [
-        "# Parkinson's Disease dataset: Predict Parkinson's disease presence\n",
-        "# Features: Voice measurements (e.g., frequency, amplitude)\n",
-        "# Samples: 195 cases\n",
-        "df = fetch_openml(\"parkinsons\")\n",
+        "# Load Parkinsons dataset described above\n",
         "\n",
-        "X, y = df.data, df.target\n",
+        "import pandas as pd, io, zipfile, requests\n",
         "\n",
-        "# Print dataset description\n",
-        "display(Markdown(df[\"DESCR\"]))\n",
+        "url_zip = \"https://archive.ics.uci.edu/static/public/174/parkinsons.zip\"\n",
+        "with requests.get(url_zip) as r:\n",
+        "    r.raise_for_status()\n",
+        "    zf = zipfile.ZipFile(io.BytesIO(r.content))\n",
+        "    df = pd.read_csv(zf.open(\"parkinsons.data\"))\n",
+        "X, y = df.drop([\"status\", \"name\"], axis=1), df[\"status\"]\n",
         "\n",
         "display(X)"
       ]
     },
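
On the notebook-review question above: one possible approach is to diff only the code cells, so that outputs and metadata do not swamp the review. The sketch below is not something this PR or the repository provides. It assumes the notebooks are nbformat 4 JSON (a top-level `"cells"` list whose entries carry `"cell_type"` and `"source"`), and the helper names `code_cell_sources`, `notebook_to_script`, and `diff_notebook_code` are hypothetical.

```python
import difflib
import json


def code_cell_sources(notebook_json: str) -> list:
    """Return the source text of every code cell in a notebook JSON string."""
    notebook = json.loads(notebook_json)
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        sources.append(source)
    return sources


def notebook_to_script(notebook_json: str) -> str:
    """Join the code cells into one script-like string with a marker per cell."""
    cells = code_cell_sources(notebook_json)
    return "\n\n".join(
        f"# cell {index}\n{source}" for index, source in enumerate(cells, start=1)
    )


def diff_notebook_code(old_json: str, new_json: str) -> list:
    """Unified diff of the code cells only; outputs and metadata are ignored."""
    old_lines = notebook_to_script(old_json).splitlines()
    new_lines = notebook_to_script(new_json).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )
```

To use it, read the two versions of an `.ipynb` file as text (for example the checked-out file and the version on `main`) and print the lines returned by `diff_notebook_code`. Markdown cells are skipped on purpose here, so changes to explanatory text would still need a separate look.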

@noahho noahho merged commit 549989e into main Oct 29, 2025
10 checks passed
oscarkey pushed a commit that referenced this pull request Nov 12, 2025
…tability (#214)

* Record copied public PR 572

* Loading files from UCI instead of openml now for stability (#572)

(cherry picked from commit 549989e)

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Noah Hollmann <noah@priorlabs.ai>