start vignette #19

Merged: 3 commits, Apr 2, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@
.RData
.Ruserdata
docs
inst/doc
8 changes: 7 additions & 1 deletion DESCRIPTION
@@ -22,10 +22,15 @@ Depends:
R (>= 2.10)
Suggests:
dplyr,
ggplot2,
knitr,
lubridate,
pdftools,
rmarkdown,
spelling,
testthat (>= 3.0.0),
tibble
tibble,
tidyr
Encoding: UTF-8
Language: en-GB
LazyData: true
@@ -34,3 +39,4 @@ Roxygen: list(markdown = TRUE)
URL: https://panukatan.io/bagyo/, https://github.com/panukatan/bagyo
BugReports: https://github.com/panukatan/bagyo/issues
Config/testthat/edition: 3
VignetteBuilder: knitr
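
The `.gitignore` and `DESCRIPTION` changes above (ignoring `inst/doc`, listing `knitr` and `rmarkdown` under `Suggests`, and setting `VignetteBuilder: knitr`) match the scaffolding that `usethis::use_vignette()` sets up. Whether the author used that helper is an assumption, and the vignette name `"bagyo"` below is hypothetical; this is only a sketch of one way to reach the same state.

```r
## Run from the root of the package source tree.
## Assumption: the usethis package is installed.
## The vignette name "bagyo" is illustrative, not taken from the PR.
usethis::use_vignette("bagyo")

## Extra packages used only in the vignette go under Suggests,
## for example the newly listed tidyr.
usethis::use_package("tidyr", type = "Suggests")
```

Keeping vignette-only packages under `Suggests` rather than `Imports` means users who only want the data do not need to install them.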
3 changes: 2 additions & 1 deletion R/bagyo.R
@@ -10,7 +10,8 @@
#' and made available through its website. This package contains Philippine
#' Tropical Cyclone data in a machine-readable format. It is hoped that this
#' data package provides an interesting and unique dataset for data exploration
#' and visualisation.
#' and visualisation as an adjunct to the traditional `iris` dataset and to the
#' current `palmerpenguins` dataset.
#'
#' @docType package
#' @keywords internal
9 changes: 5 additions & 4 deletions R/data.R
@@ -12,13 +12,14 @@
#' | *rsmc_name* | Name given to the tropical cyclone by RSMC |
#' | *start* | Date and time at which cyclone enters Philippine waters |
#' | *end* | Date and time at which cyclone leaves Philippine waters |
#' | *pressure* | Peak central pressure in *hPa* |
#' | *pressure* | Maximum central pressure in *hPa* |
#' | *speed* | Maximum sustained wind speed in *km/h* |
#'
#' @examples
#' tropical_cyclones
#' cyclones
#'
#' @source Data are drawn from PAGASA's Annual Report on Philippine Tropical
#' Cyclones found at https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report
#' Cyclones found at
#' https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report
#'
"tropical_cyclones"
"cyclones"
83 changes: 31 additions & 52 deletions README.Rmd
@@ -35,31 +35,15 @@ Oceans and seas significantly impact continental weather, with evaporation from

The Philippines frequently experiences tropical cyclones (called **bagyo** in the Filipino language) because of its geographical position. These cyclones typically bring heavy rainfall, leading to widespread flooding, as well as strong winds that cause significant damage to human life, crops, and property. Data on cyclones are collected and curated by the [Philippine Atmospheric, Geophysical, and Astronomical Services Administration (PAGASA)](https://www.pagasa.dost.gov.ph/).

This package contains Philippine Tropical Cyclone data from 2017 to 2020 in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and visualisation.
This package contains Philippine tropical cyclone data from 2017 to 2020 in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and visualisation as an adjunct to the traditional [`iris`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html) dataset and to the current [`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/) dataset.

## About the `tropical_cyclones` data
## About the `cyclones` data

The `bagyo` package contains the `tropical_cyclones` dataset. This dataset was taken from annual reports on Philippine tropical cyclones prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format.
The `bagyo` package contains the `cyclones` dataset. This dataset was taken from annual reports on Philippine tropical cyclones prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format.

Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the desired data were developed and implemented to arrive at the `tropical_cyclones` dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R).
Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the desired data were developed and implemented to arrive at the `cyclones` dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R). The `cyclones` metadata can be viewed in R through a call to `?cyclones` in the R console.

The following information is available from the dataset:

**Variable** | **Description**
:--- | :---
*year* | Year
*category_code* | Tropical cyclone category code
*category_name* | Tropical cyclone category name
*name* | Name given to the tropical cyclone by Philippine authorities
*rsmc_name* | Name given to the tropical cyclone by the Regional Specialized Meteorological Centre (RSMC)
*start* | Date and time at which cyclone enters Philippine area of responsibility (PAR)
*end* | Date and time at which cyclone leaves Philippine area of responsibility (PAR)
*pressure* | Peak central pressure in *hPa*
*speed* | Maximum sustained wind speed in *km/h*

This metadata can be viewed in R through a call to `?tropical_cyclones` in the R console.

Whilst tropical cyclones have ravaged the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 have been produced but are not available on the PAGASA website. These reports of the tropical cyclone season (re-started in 2019) are published within two years after the termination of the season. Hence, the most recent report is only up to 2020.
Whilst tropical cyclones have affected the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 have been produced but are not available on the [PAGASA](https://www.pagasa.dost.gov.ph/) website. These reports of the tropical cyclone season (re-started in 2019) are published within two years after the termination of the season. Hence, the most recent report is only up to 2020 for now.

## Installation

@@ -72,31 +56,24 @@ install.packages(
)
```

Once the `bagyo` package has been installed, the `tropical_cyclones` dataset can be loaded into R as follows:
Once the `bagyo` package has been installed, the `cyclones` dataset can be loaded into R as follows:

```{r load-dataset}
library(bagyo)
data(package = "bagyo")

tropical_cyclones
cyclones
```

## Usage

### Demonstrate tidy data wrangling

#### Tropical cyclones are interesting to summarise
### `cyclones` are interesting to summarise

```{r summary}
library(dplyr)

## Get yearly mean cyclone pressure and speed ----
tropical_cyclones |>
group_by(year) |>
summarise(mean_pressure = mean(pressure), mean_speed = mean(speed))

## Get cyclone category mean pressure and speed ----
tropical_cyclones |>
cyclones |>
group_by(category_name) |>
summarise(
n = n(),
@@ -105,32 +82,31 @@ tropical_cyclones |>
)
```

#### Tropical cyclones are useful in learning how to work with dates
### `cyclones` are useful in learning how to work with dates

```{r working-with-dates}
## Get cyclone category mean duration (in hours) ----
tropical_cyclones |>
cyclones |>
mutate(duration = end - start) |>
group_by(category_name) |>
summarise(mean_duration = mean(duration))
```
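
One detail about the chunk above: subtracting two date-time columns with `end - start` returns a `difftime` whose units R chooses automatically, so the "(in hours)" in the comment holds only when R happens to pick hours. A minimal sketch of pinning the units with base R's `difftime()` follows. The `toy` data frame and its values are made up for illustration and are not taken from `cyclones`.

```r
## Hypothetical stand-in for two rows of `cyclones`; the values are made up.
toy <- data.frame(
  category_name = c("Typhoon", "Typhoon"),
  start = as.POSIXct(c("2020-01-01 00:00:00", "2020-02-01 00:00:00"), tz = "UTC"),
  end   = as.POSIXct(c("2020-01-02 12:00:00", "2020-02-03 00:00:00"), tz = "UTC")
)

## Plain subtraction: R picks the units from the size of the differences.
auto_duration <- toy$end - toy$start
units(auto_duration)

## difftime() with an explicit `units` argument always reports hours.
toy$duration <- difftime(toy$end, toy$start, units = "hours")
units(toy$duration)

## Mean duration per category as a plain number of hours.
tapply(as.numeric(toy$duration), toy$category_name, mean)
```

In the `dplyr` pipeline the same idea would be `mutate(duration = difftime(end, start, units = "hours"))`, which keeps the summary in hours regardless of how long the cyclones lasted.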

### Demonstrate various `ggplot2` data visualisation geoms
### `cyclones` are great to visualise

#### Bar plots

```{r barplot, echo = FALSE, fig.align = "center", fig.height = 5}
```{r barplot, echo = FALSE, fig.align = "center", fig.height = 4}
library(ggplot2)

## Get cyclone category mean duration (in hours) ----
tropical_cyclones |>
cyclones |>
mutate(duration = end - start) |>
group_by(category_name) |>
summarise(mean_duration = mean(duration)) |>
ggplot(mapping = aes(x = mean_duration, y = category_name)) +
geom_col(fill = "#465b92") +
geom_col(fill = alpha("#4b876e", 0.7)) +
labs(
title = "Mean duration of cyclones by category",
title = "Mean duration of cyclones",
subtitle = "By cyclone categories",
x = "mean duration (hours)",
y = NULL
) +
@@ -142,34 +118,34 @@ tropical_cyclones |>
)
```

#### Scatter plots

```{r scatterplot, echo = FALSE, fig.align = "center", fig.height = 5, fig.width = 8}
## Cyclone speed by pressure ----
tropical_cyclones |>
cyclones |>
dplyr::mutate(year = factor(year)) |>
ggplot(mapping = aes(x = speed, y = pressure)) +
geom_point(
mapping = aes(colour = category_name), size = 2) +
geom_point(mapping = aes(colour = category_name), size = 3, alpha = 0.7) +
scale_colour_manual(
name = "Category",
name = NULL,
values = c("#9c5e60", "#4b876e", "#465b92", "#e5be72", "#5d0505")
) +
labs(
title = "Cyclone maximum sustained wind speed by maximum central pressure",
subtitle = "Grouped by cyclone categories and year",
x = "Wind speed (km/h)",
y = "Central pressure (hPa)"
title = "Cyclone maximum sustained wind speed and maximum central pressure",
subtitle = "By cyclone categories and year",
x = "wind speed (km/h)",
y = "central pressure (hPa)"
) +
facet_wrap(. ~ year, ncol = 4) +
theme_bw() +
theme(
legend.position = "top",
strip.background = element_rect(
fill = alpha("#465b92", 0.7), colour = "#465b92"
),
panel.border = element_rect(colour = "#465b92"),
panel.grid.minor = element_blank()
)
```


## Citation

If you find the `bagyo` package useful please cite using the suggested citation provided by a call to the `citation()` function as follows:
@@ -183,3 +159,6 @@ citation("bagyo")
Feedback, bug reports and feature requests are welcome; file issues or seek support [here](https://github.com/panukatan/bagyo/issues). If you would like to contribute to the package, please see our [contributing guidelines](https://panukatan.io/bagyo/CONTRIBUTING.html).

This project is released with a [Contributor Code of Conduct](https://panukatan.io/bagyo/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms.

<br>
<br>
85 changes: 29 additions & 56 deletions README.md
@@ -37,51 +37,40 @@ collected and curated by the [Philippine Atmospheric, Geophysical, and
Astronomical Services Administration
(PAGASA)](https://www.pagasa.dost.gov.ph/).

This package contains Philippine Tropical Cyclone data from 2017 to 2020
This package contains Philippine tropical cyclone data from 2017 to 2020
in a machine-readable format. It is hoped that this data package
provides an interesting and unique dataset for data exploration and
visualisation.
visualisation as an adjunct to the traditional
[`iris`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html)
dataset and to the current
[`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/)
dataset.

## About the `tropical_cyclones` data
## About the `cyclones` data

The `bagyo` package contains the `tropical_cyclones` dataset. This
dataset was taken from annual reports on Philippine tropical cyclones
prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at
its
The `bagyo` package contains the `cyclones` dataset. This dataset was
taken from annual reports on Philippine tropical cyclones prepared and
released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its
[website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report)
in PDF format.

Because the reports are in PDF format and the information described
above are in tables within the documents, scripts for scraping the
desired data were developed and implemented to arrive at the
`tropical_cyclones` dataset. The data scraping script can be viewed
desired data were developed and implemented to arrive at the `cyclones`
dataset. The data scraping script can be viewed
[here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R).

The following information is available from the dataset:

| **Variable** | **Description** |
|:----------------|:--------------------------------------------------------------------------------------------|
| *year* | Year |
| *category_code* | Tropical cyclone category code |
| *category_name* | Tropical cyclone category name |
| *name* | Name given to the tropical cyclone by Philippine authorities |
| *rsmc_name* | Name given to the tropical cyclone by the Regional Specialized Meteorological Centre (RSMC) |
| *start* | Date and time at which cyclone enters Philippine area of responsibility (PAR) |
| *end* | Date and time at which cyclone leaves Philippine area of responsibility (PAR) |
| *pressure* | Peak central pressure in *hPa* |
| *speed* | Maximum sustained wind speed in *km/h* |

This metadata can be viewed in R through a call to `?tropical_cyclones`
The `cyclones` metadata can be viewed in R through a call to `?cyclones`
in the R console.

Whilst tropical cyclones have ravaged the Philippines far earlier than
Whilst tropical cyclones have affected the Philippines far earlier than
2017 and more currently than 2020, official and publicly available data
for the information described above is only available in the reports for
years 2017 to 2020. Earlier documents of this annual reporting pre-2017
have been produced but are not available on the PAGASA website. These
reports of the tropical cyclone season (re-started in 2019) are
published within two years after the termination of the season. Hence,
the most recent report is only up to 2020.
have been produced but are not available on the
[PAGASA](https://www.pagasa.dost.gov.ph/) website. These reports of the
tropical cyclone season (re-started in 2019) are published within two
years after the termination of the season. Hence, the most recent report
is only up to 2020 for now.

## Installation

@@ -95,14 +84,14 @@ install.packages(
)
```

Once the `bagyo` package has been installed, the `tropical_cyclones`
dataset can be loaded into R as follows:
Once the `bagyo` package has been installed, the `cyclones` dataset can
be loaded into R as follows:

``` r
library(bagyo)
data(package = "bagyo")

tropical_cyclones
cyclones
#> # A tibble: 86 × 9
#> year category_code category_name name rsmc_name start
#> <dbl> <fct> <fct> <chr> <chr> <dttm>
@@ -122,27 +111,13 @@ tropical_cyclones

## Usage

### Demonstrate tidy data wrangling

#### Tropical cyclones are interesting to summarise
### `cyclones` are interesting to summarise

``` r
library(dplyr)

## Get yearly mean cyclone pressure and speed ----
tropical_cyclones |>
group_by(year) |>
summarise(mean_pressure = mean(pressure), mean_speed = mean(speed))
#> # A tibble: 4 × 3
#> year mean_pressure mean_speed
#> <dbl> <dbl> <dbl>
#> 1 2017 986. 88.0
#> 2 2018 961. 66.7
#> 3 2019 976. 59.0
#> 4 2020 973. 62.0

## Get cyclone category mean pressure and speed ----
tropical_cyclones |>
cyclones |>
group_by(category_name) |>
summarise(
n = n(),
@@ -159,11 +134,11 @@ tropical_cyclones |>
#> 5 Super Typhoon 2 908. 112.
```

#### Tropical cyclones are useful in learning how to work with dates
### `cyclones` are useful in learning how to work with dates

``` r
## Get cyclone category mean duration (in hours) ----
tropical_cyclones |>
cyclones |>
mutate(duration = end - start) |>
group_by(category_name) |>
summarise(mean_duration = mean(duration))
@@ -177,14 +152,10 @@ tropical_cyclones |>
#> 5 Super Typhoon 77.50000 hours
```

### Demonstrate various `ggplot2` data visualisation geoms

#### Bar plots
### `cyclones` are great to visualise

<img src="man/figures/README-barplot-1.png" style="display: block; margin: auto;" />

#### Scatter plots

<img src="man/figures/README-scatterplot-1.png" style="display: block; margin: auto;" />

## Citation
@@ -220,3 +191,5 @@ guidelines](https://panukatan.io/bagyo/CONTRIBUTING.html).
This project is released with a [Contributor Code of
Conduct](https://panukatan.io/bagyo/CODE_OF_CONDUCT.html). By
participating in this project you agree to abide by its terms.

<br> <br>