diff --git a/.gitignore b/.gitignore index 234f028..0d7f03b 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,4 @@ .RData .Ruserdata docs +inst/doc diff --git a/DESCRIPTION b/DESCRIPTION index 2fab087..1ccb212 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -22,10 +22,15 @@ Depends: R (>= 2.10) Suggests: dplyr, + ggplot2, + knitr, + lubridate, pdftools, + rmarkdown, spelling, testthat (>= 3.0.0), - tibble + tibble, + tidyr Encoding: UTF-8 Language: en-GB LazyData: true @@ -34,3 +39,4 @@ Roxygen: list(markdown = TRUE) URL: https://panukatan.io/bagyo/, https://github.com/panukatan/bagyo BugReports: https://github.com/panukatan/bagyo/issues Config/testthat/edition: 3 +VignetteBuilder: knitr diff --git a/R/bagyo.R b/R/bagyo.R index 0764f2e..a2195bd 100644 --- a/R/bagyo.R +++ b/R/bagyo.R @@ -10,7 +10,8 @@ #' and made available through its website. This package contains Philippine #' Tropical Cyclone data in a machine-readable format. It is hoped that this #' data package provides an interesting and unique dataset for data exploration -#' and visualisation. +#' and visualisation as an adjunct to the traditional `iris` dataset and to the +#' current `palmerpenguins` dataset. #' #' @docType package #' @keywords internal diff --git a/R/data.R b/R/data.R index 3abab69..d83ff2a 100644 --- a/R/data.R +++ b/R/data.R @@ -12,13 +12,14 @@ #' | *rsmc_name* | Name given to the tropical cyclone by RSMC | #' | *start* | Date and time at which cyclone enters Philippine waters | #' | *end* | Date and time at which cyclone leaves Philippine waters | -#' | *pressure* | Peak central pressure in *hPa* | +#' | *pressure* | Maximum central pressure in *hPa* | #' | *speed* | Maximum sustained wind speed in *km/h* | #' #' @examples -#' tropical_cyclones +#' cyclones #' #' @source Data are drawn from PAGASA's Annual Report on Philippine Tropical -#' Cyclones found at https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report +#' Cyclones found at +#' https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report #' -"tropical_cyclones" +"cyclones" diff --git a/README.Rmd b/README.Rmd index cc77f19..bb56471 100644 --- a/README.Rmd +++ b/README.Rmd @@ -35,31 +35,15 @@ Oceans and seas significantly impact continental weather, with evaporation from The Philippines frequently experiences tropical cyclones (called **bagyo** in the Filipino language) because of its geographical position. These cyclones typically bring heavy rainfall, leading to widespread flooding, as well as strong winds that cause significant damage to human life, crops, and property. Data on cyclones are collected and curated by the [Philippine Atmospheric, Geophysical, and Astronomical Services Administration (PAGASA)](https://www.pagasa.dost.gov.ph/). -This package contains Philippine Tropical Cyclone data from 2017 to 2020 in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and visualisation. +This package contains Philippine tropical cyclone data from 2017 to 2020 in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and visualisation as an adjunct to the traditional [`iris`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html) dataset and to the current [`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/) dataset. -## About the `tropical_cyclones` data +## About the `cyclones` data -The `bagyo` package contains the `tropical_cyclones` dataset. This dataset was taken from annual reports on Philippine tropical cyclones prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format. +The `bagyo` package contains the `cyclones` dataset. This dataset was taken from annual reports on Philippine tropical cyclones prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format. -Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the desired data were developed and implemented to arrive at the `tropical_cyclones` dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R). +Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the desired data were developed and implemented to arrive at the `cyclones` dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R). The `cyclones` metadata can be viewed in R through a call to `?cyclones` in the R console. -The following information is available from the dataset: - -**Variable** | **Description** -:--- | :--- -*year* | Year -*category_code* | Tropical cyclone category code -*category_name* | Tropical cyclone category name -*name* | Name given to the tropical cyclone by Philippine authorities -*rsmc_name* | Name given to the tropical cyclone by the Regional Specialized Meteorological Centre (RSMC) -*start* | Date and time at which cyclone enters Philippine area of responsibility (PAR) -*end* | Date and time at which cyclone leaves Philippine area of responsibility (PAR) -*pressure* | Peak central pressure in *hPa* -*speed* | Maximum sustained wind speed in *km/h* - -This metadata can be viewed in R through a call to `?tropical_cyclones` in the R console. - -Whilst tropical cyclones have ravaged the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 have been produced but are not available on the PAGASA website. These reports of the tropical cyclone season (re-started in 2019) are published within two years after the termination of the season. Hence, the most recent report is only up to 2020. +Whilst tropical cyclones have affected the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 have been produced but are not available on the [PAGASA](https://www.pagasa.dost.gov.ph/) website. These reports of the tropical cyclone season (re-started in 2019) are published within two years after the termination of the season. Hence, the most recent report is only up to 2020 for now. ## Installation @@ -72,31 +56,24 @@ install.packages( ) ``` -Once the `bagyo` package has been installed, the `tropical_cyclones` dataset can be loaded into R as follows: +Once the `bagyo` package has been installed, the `cyclones` dataset can be loaded into R as follows: ```{r load-dataset} library(bagyo) data(package = "bagyo") -tropical_cyclones +cyclones ``` ## Usage -### Demonstrate tidy data wrangling - -#### Tropical cyclones are interesting to summarise +### `cyclones` are interesting to summarise ```{r summary} library(dplyr) -## Get yearly mean cyclone pressure and speed ---- -tropical_cyclones |> - group_by(year) |> - summarise(mean_pressure = mean(pressure), mean_speed = mean(speed)) - ## Get cyclone category mean pressure and speed ---- -tropical_cyclones |> +cyclones |> group_by(category_name) |> summarise( n = n(), @@ -105,32 +82,31 @@ tropical_cyclones |> ) ``` -#### Tropical cyclones are useful in learning how to work with dates +### `cyclones` are useful in learning how to work with dates ```{r working-with-dates} ## Get cyclone category mean duration (in hours) ---- -tropical_cyclones |> +cyclones |> mutate(duration = end - start) |> group_by(category_name) |> summarise(mean_duration = mean(duration)) ``` -### Demonstrate various `ggplot2` data visualisation geoms +### `cyclones` are great to visualise -#### Bar plots - -```{r barplot, echo = FALSE, fig.align = "center", fig.height = 5} +```{r barplot, echo = FALSE, fig.align = "center", fig.height = 4} library(ggplot2) ## Get cyclone category mean duration (in hours) ---- -tropical_cyclones |> +cyclones |> mutate(duration = end - start) |> group_by(category_name) |> summarise(mean_duration = mean(duration)) |> ggplot(mapping = aes(x = mean_duration, y = category_name)) + - geom_col(fill = "#465b92") + + geom_col(fill = alpha("#4b876e", 0.7)) + labs( - title = "Mean duration of cyclones by category", + title = "Mean duration of cyclones", + subtitle = "By cyclone categories", x = "mean duration (hours)", y = NULL ) + @@ -142,34 +118,34 @@ tropical_cyclones |> ) ``` -#### Scatter plots - ```{r scatterplot, echo = FALSE, fig.align = "center", fig.height = 5, fig.width = 8} ## Cyclone speed by presssure ---- -tropical_cyclones |> +cyclones |> dplyr::mutate(year = factor(year)) |> ggplot(mapping = aes(x = speed, y = pressure)) + - geom_point( - mapping = aes(colour = category_name), size = 2) + + geom_point(mapping = aes(colour = category_name), size = 3, alpha = 0.7) + scale_colour_manual( - name = "Category", + name = NULL, values = c("#9c5e60", "#4b876e", "#465b92", "#e5be72", "#5d0505") ) + labs( - title = "Cyclone maximum sustained wind speed by maximum central pressure", - subtitle = "Grouped by cyclone categories and year", - x = "Wind speed (km/h)", - y = "Central pressure (hPa)" + title = "Cyclone maximum sustained wind speed and maximum central pressure", + subtitle = "By cyclone categories and year", + x = "wind speed (km/h)", + y = "central pressure (hPa)" ) + facet_wrap(. ~ year, ncol = 4) + theme_bw() + theme( legend.position = "top", + strip.background = element_rect( + fill = alpha("#465b92", 0.7), colour = "#465b92" + ), + panel.border = element_rect(colour = "#465b92"), panel.grid.minor = element_blank() ) ``` - ## Citation If you find the `bagyo` package useful please cite using the suggested citation provided by a call to the `citation()` function as follows: @@ -183,3 +159,6 @@ citation("bagyo") Feedback, bug reports and feature requests are welcome; file issues or seek support [here](https://github.com/panukatan/bagyo/issues). If you would like to contribute to the package, please see our [contributing guidelines](https://panukatan.io/bagyo/CONTRIBUTING.html). This project is released with a [Contributor Code of Conduct](https://panukatan.io/bagyo/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms. + +
+
diff --git a/README.md b/README.md index e026b6f..6998f81 100644 --- a/README.md +++ b/README.md @@ -37,51 +37,40 @@ collected and curated by the [Philippine Atmospheric, Geophysical, and Astronomical Services Administration (PAGASA)](https://www.pagasa.dost.gov.ph/). -This package contains Philippine Tropical Cyclone data from 2017 to 2020 +This package contains Philippine tropical cyclone data from 2017 to 2020 in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration and -visualisation. +visualisation as an adjunct to the traditional +[`iris`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html) +dataset and to the current +[`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/) +dataset. -## About the `tropical_cyclones` data +## About the `cyclones` data -The `bagyo` package contains the `tropical_cyclones` dataset. This -dataset was taken from annual reports on Philippine tropical cyclones -prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at -its +The `bagyo` package contains the `cyclones` dataset. This dataset was +taken from annual reports on Philippine tropical cyclones prepared and +released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format. Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the -desired data were developed and implemented to arrive at the -`tropical_cyclones` dataset. The data scraping script can be viewed +desired data were developed and implemented to arrive at the `cyclones` +dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R). - -The following information is available from the dataset: - -| **Variable** | **Description** | -|:----------------|:--------------------------------------------------------------------------------------------| -| *year* | Year | -| *category_code* | Tropical cyclone category code | -| *category_name* | Tropical cyclone category name | -| *name* | Name given to the tropical cyclone by Philippine authorities | -| *rsmc_name* | Name given to the tropical cyclone by the Regional Specialized Meteorological Centre (RSMC) | -| *start* | Date and time at which cyclone enters Philippine area of responsibility (PAR) | -| *end* | Date and time at which cyclone leaves Philippine area of responsibility (PAR) | -| *pressure* | Peak central pressure in *hPa* | -| *speed* | Maximum sustained wind speed in *km/h* | - -This metadata can be viewed in R through a call to `?tropical_cyclones` +The `cyclones` metadata can be viewed in R through a call to `?cyclones` in the R console. -Whilst tropical cyclones have ravaged the Philippines far earlier than +Whilst tropical cyclones have affected the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 -have been produced but are not available on the PAGASA website. These -reports of the tropical cyclone season (re-started in 2019) are -published within two years after the termination of the season. Hence, -the most recent report is only up to 2020. +have been produced but are not available on the +[PAGASA](https://www.pagasa.dost.gov.ph/) website. These reports of the +tropical cyclone season (re-started in 2019) are published within two +years after the termination of the season. Hence, the most recent report +is only up to 2020 for now. ## Installation @@ -95,14 +84,14 @@ install.packages( ) ``` -Once the `bagyo` package has been installed, the `tropical_cyclones` -dataset can be loaded into R as follows: +Once the `bagyo` package has been installed, the `cyclones` dataset can +be loaded into R as follows: ``` r library(bagyo) data(package = "bagyo") -tropical_cyclones +cyclones #> # A tibble: 86 × 9 #> year category_code category_name name rsmc_name start #> @@ -122,27 +111,13 @@ tropical_cyclones ## Usage -### Demonstrate tidy data wrangling - -#### Tropical cyclones are interesting to summarise +### `cyclones` are interesting to summarise ``` r library(dplyr) -## Get yearly mean cyclone pressure and speed ---- -tropical_cyclones |> - group_by(year) |> - summarise(mean_pressure = mean(pressure), mean_speed = mean(speed)) -#> # A tibble: 4 × 3 -#> year mean_pressure mean_speed -#> -#> 1 2017 986. 88.0 -#> 2 2018 961. 66.7 -#> 3 2019 976. 59.0 -#> 4 2020 973. 62.0 - ## Get cyclone category mean pressure and speed ---- -tropical_cyclones |> +cyclones |> group_by(category_name) |> summarise( n = n(), @@ -159,11 +134,11 @@ tropical_cyclones |> #> 5 Super Typhoon 2 908. 112. ``` -#### Tropical cyclones are useful in learning how to work with dates +### `cyclones` are useful in learning how to work with dates ``` r ## Get cyclone category mean duration (in hours) ---- -tropical_cyclones |> +cyclones |> mutate(duration = end - start) |> group_by(category_name) |> summarise(mean_duration = mean(duration)) @@ -177,14 +152,10 @@ tropical_cyclones |> #> 5 Super Typhoon 77.50000 hours ``` -### Demonstrate various `ggplot2` data visualisation geoms - -#### Bar plots +### `cyclones` are great to visualise -#### Scatter plots - ## Citation @@ -220,3 +191,5 @@ guidelines](https://panukatan.io/bagyo/CONTRIBUTING.html). This project is released with a [Contributor Code of Conduct](https://panukatan.io/bagyo/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms. + +

diff --git a/data-raw/process_data.R b/data-raw/process_data.R index a7b17d4..d061557 100644 --- a/data-raw/process_data.R +++ b/data-raw/process_data.R @@ -293,7 +293,7 @@ set1_2020 <- df_2020 |> .fn = function(x) c("domestic_name", "international_name", "international_code", "warning_start_date", "warning_start_time", "warning_end_date", - "warning_end_time", "peak_pressure", "peak_speed", "peak_date", + "warning_end_time", "peak_speed", "peak_pressure", "peak_date", "peak_time") ) |> dplyr::mutate( @@ -373,7 +373,7 @@ df_2020 <- set1_2020 |> ## Concatenate ---- -tropical_cyclones <- rbind(df_2017, df_2018, df_2019, df_2020) |> +cyclones <- rbind(df_2017, df_2018, df_2019, df_2020) |> dplyr::mutate( year = lubridate::year(warning_start), .before = category_code ) |> @@ -387,6 +387,6 @@ tropical_cyclones <- rbind(df_2017, df_2018, df_2019, df_2020) |> ) ## Export data ---- -usethis::use_data(tropical_cyclones, overwrite = TRUE, compress = "xz") +usethis::use_data(cyclones, overwrite = TRUE, compress = "xz") diff --git a/data/cyclones.rda b/data/cyclones.rda new file mode 100644 index 0000000..8783f8b Binary files /dev/null and b/data/cyclones.rda differ diff --git a/data/tropical_cyclones.rda b/data/tropical_cyclones.rda deleted file mode 100644 index 22b0477..0000000 Binary files a/data/tropical_cyclones.rda and /dev/null differ diff --git a/man/bagyo.Rd b/man/bagyo.Rd index 907cfba..a78395f 100644 --- a/man/bagyo.Rd +++ b/man/bagyo.Rd @@ -15,7 +15,8 @@ Atmospheric, Geophysical, and Astronomical Services Administration (PAGASA) and made available through its website. This package contains Philippine Tropical Cyclone data in a machine-readable format. It is hoped that this data package provides an interesting and unique dataset for data exploration -and visualisation. +and visualisation as an adjunct to the traditional \code{iris} dataset and to the +current \code{palmerpenguins} dataset. } \seealso{ Useful links: diff --git a/man/tropical_cyclones.Rd b/man/cyclones.Rd similarity index 81% rename from man/tropical_cyclones.Rd rename to man/cyclones.Rd index 4537410..d148f4f 100644 --- a/man/tropical_cyclones.Rd +++ b/man/cyclones.Rd @@ -1,8 +1,8 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/data.R \docType{data} -\name{tropical_cyclones} -\alias{tropical_cyclones} +\name{cyclones} +\alias{cyclones} \title{Table of tropical cyclones that entered the Philippines from 2017 to 2020s} \format{ A data frame with 9 columns and 86 rows:\tabular{ll}{ @@ -14,22 +14,23 @@ A data frame with 9 columns and 86 rows:\tabular{ll}{ \emph{rsmc_name} \tab Name given to the tropical cyclone by RSMC \cr \emph{start} \tab Date and time at which cyclone enters Philippine waters \cr \emph{end} \tab Date and time at which cyclone leaves Philippine waters \cr - \emph{pressure} \tab Peak central pressure in \emph{hPa} \cr + \emph{pressure} \tab Maximum central pressure in \emph{hPa} \cr \emph{speed} \tab Maximum sustained wind speed in \emph{km/h} \cr } } \source{ Data are drawn from PAGASA's Annual Report on Philippine Tropical -Cyclones found at https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report +Cyclones found at +https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report } \usage{ -tropical_cyclones +cyclones } \description{ Table of tropical cyclones that entered the Philippines from 2017 to 2020s } \examples{ -tropical_cyclones +cyclones } \keyword{datasets} diff --git a/man/figures/README-barplot-1.png b/man/figures/README-barplot-1.png index 80461cd..c7c93ec 100644 Binary files a/man/figures/README-barplot-1.png and b/man/figures/README-barplot-1.png differ diff --git a/man/figures/README-scatterplot-1.png b/man/figures/README-scatterplot-1.png index 44e28c7..989722d 100644 Binary files a/man/figures/README-scatterplot-1.png and b/man/figures/README-scatterplot-1.png differ diff --git a/pkgdown/_pkgdown.yml b/pkgdown/_pkgdown.yml index 4669cf2..1557eb9 100644 --- a/pkgdown/_pkgdown.yml +++ b/pkgdown/_pkgdown.yml @@ -44,6 +44,6 @@ reference: - title: Datasets contents: - - tropical_cyclones + - cyclones diff --git a/tests/testthat/test-data.R b/tests/testthat/test-data.R index 07cfd7b..1e371ef 100644 --- a/tests/testthat/test-data.R +++ b/tests/testthat/test-data.R @@ -1,11 +1,11 @@ # Tests for data --------------------------------------------------------------- -testthat::expect_s3_class(tropical_cyclones, "data.frame") +testthat::expect_s3_class(cyclones, "data.frame") testthat::expect_named( - tropical_cyclones, + cyclones, expected = c("year", "category_code", "category_name", "name", "rsmc_name", "start", "end", "pressure", "speed") ) -testthat::expect_contains(tropical_cyclones$year, 2017:2020) +testthat::expect_contains(cyclones$year, 2017:2020) diff --git a/vignettes/.gitignore b/vignettes/.gitignore new file mode 100644 index 0000000..097b241 --- /dev/null +++ b/vignettes/.gitignore @@ -0,0 +1,2 @@ +*.html +*.R diff --git a/vignettes/bagyo.Rmd b/vignettes/bagyo.Rmd new file mode 100644 index 0000000..967ad38 --- /dev/null +++ b/vignettes/bagyo.Rmd @@ -0,0 +1,192 @@ +--- +title: "Introduction to bagyo" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Introduction to bagyo} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + message = FALSE, + warning = FALSE, + collapse = TRUE, + comment = "#>" +) +``` + +```{r setup, echo = FALSE} +library(bagyo) +``` + +The `bagyo` package contains the `cyclones` dataset. This dataset was taken from annual reports on Philippine tropical cyclones prepared and released by [PAGASA](https://www.pagasa.dost.gov.ph/) at its [website](https://www.pagasa.dost.gov.ph/tropical-cyclone/publications/annual-report) in PDF format. + +Because the reports are in PDF format and the information described above are in tables within the documents, scripts for scraping the desired data were developed and implemented to arrive at the `cyclones` dataset. The data scraping script can be viewed [here](https://github.com/panukatan/bagyo/blob/main/data-raw/process_data.R). + +The following information is available from the dataset: + +**Variable** | **Description** +:--- | :--- +*year* | Year +*category_code* | Tropical cyclone category code +*category_name* | Tropical cyclone category name +*name* | Name given to the tropical cyclone by Philippine authorities +*rsmc_name* | Name given to the tropical cyclone by the Regional Specialized Meteorological Centre (RSMC) +*start* | Date and time at which cyclone enters Philippine area of responsibility (PAR) +*end* | Date and time at which cyclone leaves Philippine area of responsibility (PAR) +*pressure* | Maximum central pressure in *hPa* +*speed* | Maximum sustained wind speed in *km/h* + +This metadata can be viewed in R through a call to `?cyclones` in the R console. + +Whilst tropical cyclones have affected the Philippines far earlier than 2017 and more currently than 2020, official and publicly available data for the information described above is only available in the reports for years 2017 to 2020. Earlier documents of this annual reporting pre-2017 have been produced but are not available on the [PAGASA](https://www.pagasa.dost.gov.ph/) website. These reports of the tropical cyclone season (re-started in 2019) are published within two years after the termination of the season. Hence, the most recent report is only up to 2020 for now. + +It is expected that reports for 2021 onwards will continue to be published and made available by PAGASA. As such, the `bagyo` package and the `cyclones` dataset within it will be updated accordingly. Continued efforts are also being taken to find sources of information for years preceding 2017. + +This introductory vignette documents the utility of the `cyclones` dataset in the `bagyo` package in general statistics and data science teaching and for software documentation and testing. + +## Data wrangling + +Following are some highlight examples of how the `cyclones` dataset can be used to demonstrate various data wrangling approaches, particularly those using the `tidyverse` packages. + +### Creating summaries + +```{r summary} +library(dplyr) +library(tidyr) + +## Get number of cyclone categories per year ---- +cyclones |> + group_by(year, category_name) |> + count() |> + group_by(year) |> + complete(category_name) |> + ungroup() + +## Get yearly mean cyclone pressure and speed ---- +cyclones |> + group_by(year) |> + summarise(mean_pressure = mean(pressure), mean_speed = mean(speed)) + +## Get cyclone category mean pressure and speed ---- +cyclones |> + group_by(category_name) |> + summarise( + n = n(), + mean_pressure = mean(pressure), + mean_speed = mean(speed) + ) +``` + +### Working with date and time data + +```{r working-with-dates} +library(lubridate) + +## Get cyclone category mean duration (in hours) ---- +cyclones |> + mutate(duration = end - start) |> + group_by(category_name) |> + summarise(mean_duration = mean(duration)) + +## Get number of cyclones per month by year ---- +cyclones |> + mutate(month = month(start, label = TRUE)) |> + group_by(month, year) |> + count() |> + ungroup() |> + complete(month, year, fill = list(n = 0)) |> + arrange(year, month) +``` + +## Data visualisation + +### Bar plots + +```{r barplot, echo = FALSE, fig.align = "center", fig.height = 4} +library(ggplot2) + +## Get cyclone category mean duration (in hours) ---- +cyclones |> + mutate(duration = end - start) |> + group_by(category_name) |> + summarise(mean_duration = mean(duration)) |> + ggplot(mapping = aes(x = mean_duration, y = category_name)) + + geom_col(fill = alpha("#4b876e", 0.7)) + + labs( + title = "Mean duration of cyclones", + subtitle = "By cyclone categories", + x = "mean duration (hours)", + y = NULL + ) + + theme_minimal() + + theme( + panel.grid.minor.x = element_blank(), + panel.grid.major.y = element_blank(), + panel.grid.minor.y = element_blank() + ) +``` + +### Scatter plots + +```{r scatterplot, echo = FALSE, fig.align = "center", fig.height = 5, fig.width = 8} +## Cyclone speed by presssure ---- +cyclones |> + dplyr::mutate(year = factor(year)) |> + ggplot(mapping = aes(x = speed, y = pressure)) + + geom_point(mapping = aes(colour = category_name), size = 3, alpha = 0.7) + + scale_colour_manual( + name = NULL, + values = c("#9c5e60", "#4b876e", "#465b92", "#e5be72", "#5d0505") + ) + + labs( + title = "Cyclone maximum sustained wind speed and maximum central pressure", + subtitle = "By cyclone categories and year", + x = "wind speed (km/h)", + y = "central pressure (hPa)" + ) + + facet_wrap(. ~ year, ncol = 4) + + theme_bw() + + theme( + legend.position = "top", + strip.background = element_rect( + fill = alpha("#465b92", 0.7), colour = "#465b92" + ), + panel.border = element_rect(colour = "#465b92"), + panel.grid.minor = element_blank() + ) +``` + +## Time series + +```{r time-series, fig.align = "center", fig.height = 4, fig.width = 8} +## Get number of cyclones per month by year and plot ---- +cyclones |> + mutate(month = month(start, label = TRUE)) |> + group_by(month, year) |> + count() |> + ungroup() |> + complete(month, year, fill = list(n = 0)) |> + arrange(year, month) |> + ggplot(mapping = aes(x = month, y = n)) + + geom_col(fill = alpha("#4b876e", 0.7)) + + scale_y_continuous(breaks = seq(from = 0, to = 6, by = 1)) + + labs( + title = "Number of cyclones over time", + subtitle = "2017-2020", + x = NULL, + y = "n" + ) + + facet_wrap(. ~ year, ncol = 4) + + theme_bw() + + theme( + strip.background = element_rect( + fill = alpha("#465b92", 0.7), colour = "#465b92" + ), + panel.border = element_rect(colour = "#465b92"), + panel.grid.minor.y = element_blank(), + panel.grid.major.x = element_blank(), + axis.text.x = element_text(size = 10, angle = 90, hjust = 1, vjust = 0.5) + ) +```