Commit
Add a gallery example to show coloring of points by categories (#1006)
Provide gallery example to show coloring of points by categories. Uses the Palmer Penguins example dataset loaded into a pandas DataFrame, with the categorical dtype column converted to int type.

Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
Co-authored-by: Dongdong Tian <seisman.info@gmail.com>
Co-authored-by: Yao Jiayuan <coreman.seism@gmail.com>
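
The categorical-to-integer conversion mentioned in the commit message can be sketched on a small hand-made series. The species labels below are typed in for illustration and are not read from the penguins CSV; the point is only to show what `cat.codes.astype(int)` produces.

```python
import pandas as pd

# Illustrative labels only (assumption: not loaded from the real dataset)
species = pd.Series(["Adelie", "Gentoo", "Chinstrap", "Adelie"]).astype("category")

# pandas orders string categories lexicographically by default
categories = list(species.cat.categories)

# Each value is replaced by the position of its category in that ordering
codes = species.cat.codes.astype(int)

print(categories)
print(codes.tolist())
```

These integer codes are what the example passes to the ``color`` parameter, so each species is mapped to one slice of the colormap.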
1 parent b2b3d1f, commit a8a8781
Showing 1 changed file with 72 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
""" | ||
Color points by categories | ||
--------------------------- | ||
The :meth:`pygmt.Figure.plot` method can be used to plot symbols which are | ||
color-coded by categories. In the example below, we show how the | ||
`Palmer Penguins dataset <https://github.com/allisonhorst/palmerpenguins>`__ | ||
can be visualized. Here, we pass the integer codes of the categories in | ||
the "species" column to the ``color`` parameter via | ||
``color=df.species.cat.codes.astype(int)``. Additionally, we have to set | ||
``cmap=True``. A desired colormap can be selected via the | ||
:func:`pygmt.makecpt` function. | ||
""" | ||
|
||
import pandas as pd | ||
import pygmt | ||
|
||
# Load sample penguins data and convert 'species' column to categorical dtype | ||
df = pd.read_csv("https://github.com/mwaskom/seaborn-data/raw/master/penguins.csv") | ||
df.species = df.species.astype(dtype="category") | ||
|
||
# Use pygmt.info to get region bounds (xmin, xmax, ymin, ymax) | ||
# The call below returns a numpy array like [30.0, 60.0, 12.0, 22.0] | ||
region = pygmt.info( | ||
table=df[["bill_length_mm", "bill_depth_mm"]], # x and y columns | ||
per_column=True, # report the min/max values per column as a numpy array | ||
# round the min/max values of the first two columns to the nearest multiple | ||
# of 3 and 2, respectively | ||
spacing=(3, 2), | ||
) | ||
|
||
# Make a 2D categorical scatter plot, coloring each of the 3 species differently | ||
fig = pygmt.Figure() | ||
|
||
# Generate a basemap of 10 cm x 10 cm size | ||
fig.basemap( | ||
region=region, | ||
projection="X10c/10c", | ||
frame=[ | ||
'xafg+l"Bill length (mm)"', | ||
'yafg+l"Bill depth (mm)"', | ||
'WSen+t"Penguin size at Palmer Station"', | ||
], | ||
) | ||
|
||
# Define a colormap to be used for three categories, define the range of the | ||
# new discrete CPT using series=(lowest_value, highest_value, interval), | ||
# use color_model="+c" to write the discrete color palette "inferno" in | ||
# categorical format | ||
pygmt.makecpt(cmap="inferno", series=(0, 3, 1), color_model="+c") | ||
|
||
fig.plot( | ||
# Use bill length and bill depth as x and y data input, respectively | ||
x=df.bill_length_mm, | ||
y=df.bill_depth_mm, | ||
# Vary each symbol size according to another feature (body mass, scaled by 7.5e-5) | ||
sizes=df.body_mass_g * 7.5e-5, | ||
# Points colored by categorical number code | ||
color=df.species.cat.codes.astype(int), | ||
# Use colormap created by makecpt | ||
cmap=True, | ||
# Do not clip symbols that fall close to the map bounds | ||
no_clip=True, | ||
# Use circles as symbols with size in centimeter units | ||
style="cc", | ||
# Set transparency level for all symbols to deal with overplotting | ||
transparency=40, | ||
) | ||
|
||
# A colorbar displaying the different penguin species types will be added | ||
# once GMT 6.2.0 is released. | ||
|
||
fig.show() |
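
The region bounds and symbol sizes used in the example can be approximated without GMT. This is a minimal sketch, assuming that `pygmt.info` with `spacing` rounds each minimum down and each maximum up to a multiple of the given increment, which is consistent with the sample output `[30.0, 60.0, 12.0, 22.0]` quoted in the code comment. The helper `rounded_region` and the input values are hypothetical and are used only for illustration.

```python
import numpy as np


def rounded_region(x, y, spacing=(3, 2)):
    """Hypothetical helper: expand (min, max) of x and y outward to multiples of spacing."""
    bounds = []
    for values, inc in zip((x, y), spacing):
        values = np.asarray(values, dtype=float)
        # Round the minimum down and the maximum up; NaNs are ignored
        bounds.append(np.floor(np.nanmin(values) / inc) * inc)
        bounds.append(np.ceil(np.nanmax(values) / inc) * inc)
    return np.array(bounds)


# Illustrative measurements in mm (assumption: not read from the dataset)
bill_length = [32.1, 45.0, 59.6]
bill_depth = [13.1, 17.0, 21.5]
region = rounded_region(bill_length, bill_depth)

# Symbol size in cm for a 4000 g penguin, using the example's scale factor
symbol_size_cm = 4000 * 7.5e-5

print(region)
print(symbol_size_cm)
```

With the scale factor of 7.5e-5, a body mass of 4000 g gives a circle of about 0.3 cm, which keeps the symbols readable on the 10 cm x 10 cm basemap.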