Which data to use for geopandas.GeoDataFrame choropleth example? #2786

Closed
michaelgrund opened this issue Oct 31, 2023 · 4 comments
Labels
question Further information is requested

@michaelgrund
Member

I wanted to continue working on the choropleth example mentioned in #1374. In the meantime, however,
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")) raises a deprecation warning saying that the datasets will be removed in geopandas v1.0 (see the discussion in #2751).

Thus, we need to use another data source. Any ideas on what could be used without too much effort and without adding further dependencies to PyGMT?

@michaelgrund michaelgrund added the question Further information is requested label Oct 31, 2023
@seisman
Member

seisman commented Oct 31, 2023

Maybe read a URL directly?

world = gpd.read_file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_land.zip")

@michaelgrund
Member Author

Maybe read a URL directly?

world = gpd.read_file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_land.zip")

Yes, that works, but unfortunately the fetched data neither contains population values nor can be filtered, e.g., for a specific continent or region like "Europe".
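A quick way to spot this kind of gap before building a choropleth is to compare the columns a dataset offers with the ones the example needs. The following is only a minimal sketch: the helper name missing_columns and the column names in the sample lists are assumptions made for illustration, not the verified contents of the file above.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in the order given.

    Hypothetical helper for illustration; not part of geopandas or PyGMT.
    """
    present = set(available)
    return [name for name in required if name not in present]


# Assumed column names, for illustration only.
available = ["featurecla", "geometry"]
required = ["pop_est", "continent", "geometry"]

print(missing_columns(available, required))
```

With a GeoDataFrame at hand, the first argument would be world.columns.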

@yvonnefroehlich
Member

I faced this issue when starting to work on possible figures for the PyGMT paper.

As far as I understand, there is now a separate repo from the geopandas organization for datasets, geodatasets: https://github.com/geopandas/geodatasets and https://geodatasets.readthedocs.io/en/latest/.

These datasets can be accessed as follows:

import geopandas
import geodatasets

airbnb = geopandas.read_file(geodatasets.get_path("geoda.airbnb"))

It is possible to get more information about these datasets (please see https://geodatasets.readthedocs.io/en/latest/#how-to-use):

import geodatasets
geodatasets.data

Gives:

{'geoda': {'airbnb': {'url': 'https://geodacenter.github.io/data-and-lab//data/airbnb.zip',
   'license': 'NA',
   'attribution': 'Center for Spatial Data Science, University of Chicago',
   'name': 'geoda.airbnb',
   'description': 'Airbnb rentals, socioeconomics, and crime in Chicago',
   'geometry_type': 'Polygon',
   'nrows': 77,
   'ncols': 21,
   'details': 'https://geodacenter.github.io/data-and-lab//airbnb/',
   'hash': 'a2ab1e3f938226d287dd76cde18c00e2d3a260640dd826da7131827d9e76c824',
   'filename': 'airbnb.zip'},
  'atlanta': {'url': 'https://geodacenter.github.io/data-and-lab//data/atlanta_hom.zip',
   'license': 'NA',
   'attribution': 'Center for Spatial Data Science, University of Chicago',
   'name': 'geoda.atlanta',
   'description': 'Atlanta, GA region homicide counts and rates',
   'geometry_type': 'Polygon',
   'nrows': 90,
   'ncols': 24,
   'details': 'https://geodacenter.github.io/data-and-lab//atlanta_old/',
   'hash': 'a33a76e12168fe84361e60c88a9df4856730487305846c559715c89b1a2b5e09',
   'filename': 'atlanta_hom.zip',
   'members': ['atlanta_hom/atl_hom.geojson']},
   ...

The complete URL can also be found there (directly via geodatasets.data.geoda.airbnb or geodatasets.get_url("geoda.airbnb")). The URL can be passed directly to geopandas.read_file(). So, if you know the URL of the dataset, you no longer need to have geodatasets installed.
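To illustrate the lookup, here is a minimal sketch that resolves a dotted dataset name against a plain nested dictionary shaped like the output above. The helper name resolve_url and the trimmed registry dict are assumptions made for illustration; with geodatasets installed, geodatasets.get_url("geoda.airbnb") is the call to use instead.

```python
# Trimmed copy of the printed structure above; only the "url" fields are kept.
registry = {
    "geoda": {
        "airbnb": {
            "url": "https://geodacenter.github.io/data-and-lab//data/airbnb.zip",
        },
        "atlanta": {
            "url": "https://geodacenter.github.io/data-and-lab//data/atlanta_hom.zip",
        },
    },
}


def resolve_url(registry, name):
    """Return the "url" entry for a dotted name such as "geoda.airbnb".

    Hypothetical helper for illustration; not part of geodatasets.
    """
    entry = registry
    for key in name.split("."):
        entry = entry[key]  # raises KeyError for an unknown provider or dataset
    return entry["url"]


url = resolve_url(registry, "geoda.airbnb")
print(url)
```

The resulting string is what gets handed to geopandas.read_file().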

The dataset is read into a GeoDataFrame and can then be used for analysis and plotting, e.g., in PyGMT. I just tried this with the first dataset listed by geodatasets.data:

import geopandas as gpd
import numpy as np
import pygmt

# Read the zipped dataset directly from its URL; geodatasets is not needed
airbnb = gpd.read_file("https://geodacenter.github.io/data-and-lab//data/airbnb.zip")

fig = pygmt.Figure()

# Base map of the Chicago area
fig.coast(
    region=[-88, -87.2, 41.6, 42.05],
    projection="M10c",
    frame=True,
    water="lightblue",
    land="gray90",
    shorelines="1/1p,gray30",
)

# Colormap spanning the range of the population values
pygmt.makecpt(
    cmap="bilbao",
    series=[np.min(airbnb["population"]), np.max(airbnb["population"]), 10],
    continuous=True,
)

# Plot the polygons, filled according to the "population" column
fig.plot(
    data=airbnb[["population", "geometry"]],
    pen="0.2p,gray10",
    close=True,
    fill="+z",
    cmap=True,
    aspatial="Z=population",
)

# Add a colorbar with a label
fig.colorbar(frame="x+lPopulation")

fig.show()
# fig.savefig(fname="choropleth_airbnb_population.png")

Output figure:
choropleth_airbnb_population
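
The series argument passed to pygmt.makecpt in the script is simply [minimum, maximum, increment] of the plotted column. As a rough sketch, this could be wrapped in a small helper that also skips missing values; the name cpt_series and the sample numbers are made up for illustration and are not part of PyGMT or the dataset.

```python
import math


def cpt_series(values, increment):
    """Build a [min, max, increment] list from numeric values, ignoring NaN.

    Hypothetical helper for illustration; not part of PyGMT.
    """
    finite = [float(v) for v in values if not math.isnan(float(v))]
    if not finite:
        raise ValueError("no finite values to build a series from")
    return [min(finite), max(finite), increment]


# Made-up population counts, only to show the call.
sample_population = [1200, 35000, float("nan"), 870]
print(cpt_series(sample_population, 10))
```

With the GeoDataFrame from the script, the call would be series=cpt_series(airbnb["population"], 10).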

So, @michaelgrund, maybe you can find an interesting dataset for your gallery example in the overview at https://geodatasets.readthedocs.io/en/latest/introduction.html#what-is-the-geodatasets-data-object?

@weiji14 weiji14 added the documentation Improvements or additions to documentation label Oct 31, 2023
@seisman seisman added this to the 0.11.0 milestone Nov 1, 2023
@michaelgrund
Member Author

I didn't realize that I could load the file directly. I thought you also need to include the geodatasets library, which would have been a new dependency. Thanks @yvonnefroehlich. Just modified your example a little, see #2796.

@seisman seisman removed the documentation Improvements or additions to documentation label Nov 3, 2023
4 participants