Use a different dataset in the gallery example "examples/gallery/lines/roads.py" #3320

Closed
seisman opened this issue Jul 9, 2024 · 10 comments · Fixed by #3711

@seisman
Member

seisman commented Jul 9, 2024

The gallery example (https://www.pygmt.org/v0.12.0/gallery/lines/roads.html) uses a dataset from http://www2.census.gov/geo/tiger/TIGER2015/PRISECROADS/tl_2015_15_prisecroads.zip.

I get an "Access Denied" error when I try to download the data from a Chinese IP address, but the download works when I use a VPN server in the US or Japan.

It would be better if we could find another dataset which is more accessible to users.
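One way to compare candidate datasets from different networks is a small reachability check. The following is only a sketch using the Python standard library; the helper name `is_reachable` and its `opener` parameter are hypothetical and not part of PyGMT. It also assumes the server answers `HEAD` requests, which not every host does.

```python
import urllib.error
import urllib.request


def is_reachable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request to `url` succeeds, False otherwise.

    `opener` is a hypothetical hook so the check can be exercised
    without network access; it defaults to urllib's urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            # Treat 2xx and 3xx status codes as "reachable"
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (e.g. 403 "Access Denied"), and timeouts
        # are all subclasses of OSError
        return False
```

Calling `is_reachable("http://www2.census.gov/geo/tiger/TIGER2015/PRISECROADS/tl_2015_15_prisecroads.zip")` from the affected network should then return `False` if the server denies access, although a `False` result alone does not tell a block apart from a host that simply rejects `HEAD`.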

@seisman seisman added the documentation Improvements or additions to documentation label Jul 9, 2024
@yvonnefroehlich
Member

yvonnefroehlich commented Jul 9, 2024

I like the current geopandas line-geometry example. However, I agree that we should use a dataset which is accessible to more [all] users!
Maybe we can find a line-geometry dataset in the list available at https://geodatasets.readthedocs.io/en/latest/introduction.html#what-is-the-geodatasets-data-object. Sometimes a conversion of the coordinates is needed. I started going through the list (not finished yet) and just picked a dataset with rivers in Europe:

import geopandas as gpd
import pygmt

# Read the zipped shapefile of large European rivers provided by the EEA
gpd_lines = gpd.read_file(
    "https://www.eea.europa.eu/data-and-maps/data/wise-large-rivers-and-large-lakes/zipped-shapefile-with-wise-large-rivers-vector-line/zipped-shapefile-with-wise-large-rivers-vector-line/at_download/file/"
    "wise_large_rivers.zip"
)

# Inspect the original coordinate reference system, then convert the
# coordinates to longitude/latitude (EPSG:4326)
print(gpd_lines.crs)
gpd_lines_new = gpd_lines.to_crs("EPSG:4326")
print(gpd_lines_new.head())

fig = pygmt.Figure()

# Basemap of Europe
fig.coast(
    projection="M10c",
    region=[-10, 30, 35, 57],
    land="gray99",
    shorelines="1/0.1p,gray50",
    borders="1/0.1p,gray30",
    frame=True,
    # rivers="1/1p,lightred",  # Compare with GMT built-in
)

# Plot the line geometries directly from the GeoDataFrame
fig.plot(data=gpd_lines_new, pen="0.5p,steelblue")

fig.show()

(Image: gpd_lines_rivers)

For this dataset, we can [only] filter based on Shape_Leng and use a different pen for each subset, similar to the different road types:

import geopandas as gpd
import pygmt

# Read the zipped shapefile of large European rivers provided by the EEA
gpd_rivers_org = gpd.read_file(
    "https://www.eea.europa.eu/data-and-maps/data/wise-large-rivers-and-large-lakes/zipped-shapefile-with-wise-large-rivers-vector-line/zipped-shapefile-with-wise-large-rivers-vector-line/at_download/file/"
    "wise_large_rivers.zip"
)

# Convert the coordinates to longitude/latitude (EPSG:4326)
gpd_rivers = gpd_rivers_org.to_crs("EPSG:4326")

fig = pygmt.Figure()

for i_panel in range(2):
    # Basemap shared by both panels
    fig.coast(
        projection="M10c",
        region=[-10, 35, 35, 58],
        land="gray99",
        shorelines="1/0.1p,gray50",
        borders="1/0.01p,gray70",
        frame=True,
    )

    # Left panel: split the rivers into two subsets at a length threshold
    if i_panel == 0:
        len_limit = 700000
        gpd_rivers_short = gpd_rivers[gpd_rivers["Shape_Leng"] < len_limit]
        # Use >= so a river with exactly the threshold length is not dropped
        gpd_rivers_long = gpd_rivers[gpd_rivers["Shape_Leng"] >= len_limit]
        fig.plot(data=gpd_rivers_short, pen="0.5p,orange", label=f"shorter than {len_limit} m")
        fig.plot(data=gpd_rivers_long, pen="0.5p,darkred", label=f"{len_limit} m or longer")
        fig.legend()

    # Right panel: color-code each river by its length
    if i_panel == 1:
        pygmt.makecpt(
            cmap="oslo",
            series=[gpd_rivers.Shape_Leng.min(), 1500000],
            reverse=True,
        )
        for i_river in range(len(gpd_rivers)):
            fig.plot(
                data=gpd_rivers[gpd_rivers.index == i_river],
                zvalue=gpd_rivers.loc[i_river, "Shape_Leng"],
                pen="0.5p",
                cmap=True,
            )
        fig.colorbar(frame=["x+llength", "y+lm"], position="+ef0.2c")

    # Move the plotting origin to the right for the next panel
    fig.shift_origin(xshift="w+1.5c")

fig.show()

(Image: gpd_lines_filter_colorcoding)
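When splitting features at a threshold, pairing `<` for one subset with `>=` for the other ensures that a feature whose length equals the limit exactly ends up in one of the two groups. A minimal plain-Python sketch of that idea follows; the helper name `split_by_length` is hypothetical, and the default limit is simply the value tried in this comment.

```python
def split_by_length(lengths_m, limit_m=700_000):
    """Split lengths into (short, long) lists without dropping any value.

    Values strictly below `limit_m` go to `short`; all others,
    including values equal to the limit, go to `long`.
    """
    short = [length for length in lengths_m if length < limit_m]
    long = [length for length in lengths_m if length >= limit_m]
    return short, long
```

The same pair of comparisons can be used in the boolean masks on the `Shape_Leng` column of the GeoDataFrame.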

@seisman
Member Author

seisman commented Jul 9, 2024

ping the author of the gallery example @weiji14

@yvonnefroehlich
Member

ping the author of the gallery example @weiji14

Hm. Not sure, but looking at PR #1474 it seems like @michaelgrund wrote the first version of this example.

@seisman
Member Author

seisman commented Jul 10, 2024

You're right. @michaelgrund is the original author. @weiji14 was recently dealing with the Okina character ʻ #1474 (comment), which gave me the wrong impression that he was the author.

@michaelgrund
Member

michaelgrund commented Jul 10, 2024

I'm fine with changing the data resource for this example and really like the rivers dataset @yvonnefroehlich proposed. However:

  1. Is this dataset really available everywhere, at least can you access this @seisman ?
  2. I would only show one figure instead of multiples and ignore filtering or length-related color-coding to keep the example as simple as possible
  3. Not sure if users may be confused because GMT also has built-in rivers to plot

@seisman
Member Author

seisman commented Jul 10, 2024

  1. Is this dataset really available everywhere, at least can you access this @seisman ?

Yes, I can access the data. The current dataset in this example is hosted on a US government site. I guess that's why it blocks access from China.

@yvonnefroehlich
Member

yvonnefroehlich commented Jul 10, 2024

  1. Is this dataset really available everywhere, at least can you access this @seisman ?

I was hoping so, as this dataset is provided by the European Union / European Environment Agency (EEA).

  2. I would only show one figure instead of multiples and ignore filtering or length-related color-coding to keep the example as simple as possible

Sure, one figure should be enough to show the principle. I just took the opportunity to play around with the data 😄.

  3. Not sure if users may be confused because GMT also has built-in rivers to plot

I don't think that this is a major issue. Users may have their own datasets with more detailed (or newer) data. If people think it is needed, we could include a short comment in the description of the example.

@michaelgrund michaelgrund added this to the 0.14.0 milestone Sep 4, 2024
@seisman seisman removed this from the 0.14.0 milestone Sep 5, 2024
@yvonnefroehlich yvonnefroehlich added this to the 0.14.0 milestone Sep 7, 2024
@seisman seisman removed this from the 0.14.0 milestone Sep 9, 2024
@yvonnefroehlich
Member

Hm. What are the plans with this issue? @michaelgrund and @seisman do we want to get a related PR into v0.14.0?

@seisman
Member Author

seisman commented Dec 17, 2024

Yes, a PR sounds good, depending on your or @michaelgrund's availability.

@michaelgrund
Member

Will work on this next week 😉
