{"cells": [{"cell_type": "markdown", "id": "db7f08c1", "metadata": {}, "source": ["## Open Government Data, provided by **Canton Zurich**\n", "*Autogenerated Python starter code for data set with identifier* **112@statistisches-amt-kanton-zuerich**"]}, {"cell_type": "markdown", "id": "164c4327", "metadata": {}, "source": ["## Dataset\n", "# **Bruttoverschuldungsanteil [%]**"]}, {"cell_type": "markdown", "id": "46d13c8b-4ac3-4e2f-940f-0c14ec5b7c0d", "metadata": {}, "source": ["## Description\n", "\n", "Gross debt as a percentage of current revenue. The gross debt ratio (Bruttoverschuldungsanteil) is a measure for assessing a municipality's debt situation. It shows the share of current revenue that would be needed to pay off the gross debt, which makes it possible to judge whether the debt is in reasonable proportion to the revenue generated. The figures are based on the consolidated municipality: the budget of the school municipality is allocated proportionally to the budgets of the political municipalities. The accounting model changed from HRM1 to HRM2 as of fiscal year 2019."]}, 
{"cell_type": "markdown", "id": "cad813bb-c986-4bb4-b52b-f78bb608086f", "metadata": {}, "source": ["## Data set links\n", "\n", "[Direct data shop link for dataset](https://www.zh.ch/de/politik-staat/statistik-daten/datenkatalog.html#/datasets/112@statistisches-amt-kanton-zuerich)"]}, {"cell_type": "markdown", "id": "9d4813e9", "metadata": {}, "source": ["## Metadata\n", "- **Issued** `2016-01-20T20:16:00`\n- **Modified** `2024-10-04T10:13:12`\n- **Startdate** `1990-12-31`\n- **Enddate** `2023-12-31`\n- **Theme** `['http://publications.europa.eu/resource/authority/data-theme/GOVE', 'http://publications.europa.eu/resource/authority/data-theme/ECON']`\n- **Keyword** `['bezirke', 'verschuldungsanteil', 'gemeindefinanzen', 'gemeinden', 'kanton_zuerich', 'oeffentliche_finanzen', 'ogd']`\n- **Publisher** `['Statistisches Amt des Kantons Z\u00fcrich']`\n- **Landingpage** `https://www.zh.ch/de/politik-staat/gemeinden/gemeindeportraet.html`\n"]}, {"cell_type": "markdown", "id": "8a857d65", "metadata": {"jp-MarkdownHeadingCollapsed": true, "tags": []}, "source": ["## Imports and helper functions"]}, {"cell_type": "code", "execution_count": null, "id": "93b39602-1c1e-46d2-ae70-1716b1481e9b", "metadata": {"tags": []}, "outputs": [], "source": ["%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.style.use('ggplot')\n", "\n", "params = {\n", "    'text.color': (0.25, 0.25, 0.25),\n", "    'figure.figsize': [18, 6],\n", "}\n", "\n", "plt.rcParams.update(params)\n", "\n", "import pandas as pd"]}, {"cell_type": "code", "execution_count": null, "id": "aa6611d7-e1c0-40a7-b0ff-601a8ef8439b", "metadata": {"tags": []}, "outputs": [], "source": ["# helper function for reading datasets with proper separator\n", "def get_dataset(url):\n", "    if url[-3:] != \"csv\":\n", "        print(\"The data set URL has no proper 'csv' extension. Reading the dataset might not work as expected.\\nPlease check the dataset link and adjust pandas' read_csv() parameters accordingly.\")\n", "    data = pd.read_csv(url, sep=\",\", on_bad_lines='warn', encoding_errors='ignore', low_memory=False)\n", "    # if the dataframe has one column or fewer, the data is not comma-separated; try \";\" instead\n", "    if data.shape[1] <= 1:\n", "        data = pd.read_csv(url, sep=';', on_bad_lines='warn', encoding_errors='ignore', low_memory=False)\n", "        if data.shape[1] <= 1:\n", "            print(\"The data wasn't imported properly. Very likely the correct separator couldn't be found.\\nPlease check the dataset manually and adjust the code.\")\n", "    return data"]}, 
{"cell_type": "markdown", "id": "02ce518f", "metadata": {}, "source": ["## Load data\n", "\n", "- The dataset has **`1` distribution(s)** in CSV format.\n", "- All available CSV distributions are listed below and can be read into a pandas dataframe."]}, {"cell_type": "code", "execution_count": null, "id": "0", "metadata": {"tags": []}, "outputs": [], "source": "# Distribution 0\n# Ktzhdistid : 92\n# Title : Bruttoverschuldungsanteil [%]\n# Description : None\n# Issued : 2016-01-21T16:30:35\n# Modified : 2024-10-04T10:13:12\n\ndf = get_dataset('https://www.web.statistik.zh.ch/ogd/data/KANTON_ZUERICH_413.csv')\n\n"}, {"cell_type": "markdown", "id": "4ce1f78f", "metadata": {}, "source": ["## Analyze data"]}, {"cell_type": "code", "execution_count": null, "id": "3e3dab86", "metadata": {}, "outputs": [], "source": ["# drop columns that have no values\n", "df.dropna(how='all', axis=1, inplace=True)"]}, {"cell_type": "code", "execution_count": null, "id": "841bd8d2", "metadata": {}, "outputs": [], "source": ["print(f'The dataset has {df.shape[0]:,.0f} rows (observations) and {df.shape[1]:,.0f} columns (variables).')\n", "print(f'There seem to be {df.duplicated().sum()} exact duplicates in the data.')"]}, {"cell_type": "code", "execution_count": null, "id": "75e73c96", "metadata": {}, 
"outputs": [], "source": ["df.info(memory_usage='deep', verbose=True)"]}, {"cell_type": "code", "execution_count": null, "id": "02f3df4d", "metadata": {}, "outputs": [], "source": ["df.head()"]}, {"cell_type": "code", "execution_count": null, "id": "a0d7d898", "metadata": {}, "outputs": [], "source": ["# display a small random sample transposed in order to see all variables\n", "df.sample(3).T"]}, {"cell_type": "code", "execution_count": null, "id": "786806dd", "metadata": {}, "outputs": [], "source": ["# describe non-numerical features\n", "try:\n", "    with pd.option_context('display.float_format', '{:,.2f}'.format):\n", "        display(df.describe(exclude='number'))\n", "except Exception:\n", "    print(\"No categorical data in dataset.\")"]}, {"cell_type": "code", "execution_count": null, "id": "e744a6b6", "metadata": {}, "outputs": [], "source": ["# describe numerical features\n", "try:\n", "    with pd.option_context('display.float_format', '{:,.2f}'.format):\n", "        display(df.describe(include='number'))\n", "except Exception:\n", "    print(\"No numerical data in dataset.\")"]}, {"cell_type": "code", "execution_count": null, "id": "7a65d95d", "metadata": {}, "outputs": [], "source": ["# check missing values with missingno\n", "# https://github.com/ResidentMario/missingno\n", "import missingno as msno\n", "msno.matrix(df, labels=True, sort='descending');"]}, {"cell_type": "code", "execution_count": null, "id": "fcc604b7", "metadata": {}, "outputs": [], "source": ["# plot a histogram for each numerical feature\n", "try:\n", "    df.hist(bins=25, rwidth=.9)\n", "    plt.tight_layout()\n", "    plt.show()\n", "except Exception:\n", "    print(\"No numerical data to plot.\")"]}, {"cell_type": "code", "execution_count": null, "id": "e13cc6eb", "metadata": {}, "outputs": [], "source": ["# continue your code here..."]}, {"cell_type": "code", "execution_count": null, "id": "18886378", "metadata": {}, "outputs": [], "source": []}, {"cell_type": "code", "execution_count": null, "id": "59fce071", "metadata": {}, "outputs": [], 
"source": []}, {"cell_type": "code", "execution_count": null, "id": "f42f25aa", "metadata": {}, "outputs": [], "source": []}, {"cell_type": "markdown", "id": "c6b87d83", "metadata": {}, "source": ["**Contact**: Statistisches Amt des Kantons Z\u00fcrich | Data Shop | datashop@statistik.zh.ch"]}], "metadata": {"kernelspec": {"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.15"}}, "nbformat": 4, "nbformat_minor": 5}