make all markdown pandoc compliant
lorenzwalthert committed Apr 18, 2022
1 parent 3edbae8 commit 17800da
Showing 16 changed files with 975 additions and 1,212 deletions.
18 changes: 12 additions & 6 deletions LICENSE.md
@@ -1,13 +1,19 @@
---
editor_options:
markdown:
wrap: 79
---

# MIT License

Copyright (c) 2021 styler authors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
1,117 changes: 603 additions & 514 deletions NEWS.md

Large diffs are not rendered by default.

41 changes: 24 additions & 17 deletions README.Rmd
@@ -2,6 +2,9 @@
output:
github_document:
html_preview: true
editor_options:
markdown:
wrap: 79
---

<!-- README.md is generated from README.Rmd. Please edit that file -->
@@ -17,30 +20,34 @@ knitr::opts_chunk$set(
# styler

<!-- badges: start -->
[![R build status](https://github.com/r-lib/styler/workflows/R-CMD-check/badge.svg)](https://github.com/r-lib/styler/actions)
[![Life cycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![codecov test coverage](https://app.codecov.io/gh/r-lib/styler/branch/main/graph/badge.svg)](https://app.codecov.io/gh/r-lib/styler)
[![CRAN Status](https://www.r-pkg.org/badges/version/styler)](https://cran.r-project.org/package=styler)
<!-- badges: end -->

# Overview
[![R build
status](https://github.com/r-lib/styler/workflows/R-CMD-check/badge.svg)](https://github.com/r-lib/styler/actions)
[![Life cycle:
stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![codecov test
coverage](https://app.codecov.io/gh/r-lib/styler/branch/main/graph/badge.svg)](https://app.codecov.io/gh/r-lib/styler)
[![CRAN
Status](https://www.r-pkg.org/badges/version/styler)](https://cran.r-project.org/package=styler)

<!-- badges: end -->

styler formats your code according to the
[tidyverse style guide](https://style.tidyverse.org) (or your custom style guide)
so you can direct your attention to the content of your code. It helps to
keep the coding style consistent across projects and facilitate collaboration.
You can access styler through
# Overview

* the RStudio Addin as demonstrated below
* R functions like `style_pkg()`, `style_file()` or `style_text()`
* various other tools described in `vignette("third-party-integrations")`
styler formats your code according to the [tidyverse style
guide](https://style.tidyverse.org) (or your custom style guide) so you can
direct your attention to the content of your code. It helps to keep the coding
style consistent across projects and facilitate collaboration. You can access
styler through

- the RStudio Addin as demonstrated below
- R functions like `style_pkg()`, `style_file()` or `style_text()`
- various other tools described in `vignette("third-party-integrations")`

```{r, out.width = "650px", echo = FALSE}
knitr::include_graphics("https://raw.githubusercontent.com/lorenzwalthert/some_raw_data/master/styler_0.1.gif")
```


## Installation

You can install the package from CRAN.
@@ -60,6 +67,6 @@ remotes::install_github("r-lib/styler")

The following online docs are available:

- [latest CRAN release](https://styler.r-lib.org).
- [latest CRAN release](https://styler.r-lib.org).

- [GitHub development version](https://styler.r-lib.org/dev/).
- [GitHub development version](https://styler.r-lib.org/dev/).
1 change: 1 addition & 0 deletions README.md
@@ -13,6 +13,7 @@ stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://
coverage](https://app.codecov.io/gh/r-lib/styler/branch/main/graph/badge.svg)](https://app.codecov.io/gh/r-lib/styler)
[![CRAN
Status](https://www.r-pkg.org/badges/version/styler)](https://cran.r-project.org/package=styler)

<!-- badges: end -->

# Overview
64 changes: 32 additions & 32 deletions cran-comments.md
@@ -1,49 +1,49 @@
---
editor_options:
markdown:
wrap: 79
---

This release does not check for a specific error message from `parse()` anymore
when the input involves unparsable use of `_`. The release was requested by Luke
Tierney.
when the input involves unparsable use of `_`. The release was requested by
Luke Tierney.

## Test environments


* ubuntu 18.04 (on GitHub Actions): R devel, R 4.1.2, R 4.0.5, R 3.6, R 3.5, R 3.4
* Windows Server 10 (on GitHub Actions): R 3.6, R 4.0.5
* win-builder: R devel
- ubuntu 18.04 (on GitHub Actions): R devel, R 4.1.2, R 4.0.5, R 3.6, R 3.5,
R 3.4
- Windows Server 10 (on GitHub Actions): R 3.6, R 4.0.5
- win-builder: R devel

## R CMD check results

0 ERRORS | 0 WARNINGS | 1 NOTES
0 ERRORS \| 0 WARNINGS \| 1 NOTES

The note was generated on winbuilder when incoming checks were enabled only and
contained many blocks like this:
The note was generated on winbuilder when incoming checks were enabled only and
contained many blocks like this:

```
Found the following (possibly) invalid URLs:
URL: https://github.com/ropensci/drake
From: inst/doc/third-party-integrations.html
NEWS.md
Status: 429
Message: Too Many Requests
```
Found the following (possibly) invalid URLs:
URL: https://github.com/ropensci/drake
From: inst/doc/third-party-integrations.html
NEWS.md
Status: 429
Message: Too Many Requests

It seems my package contains many URLs to GitHub and their rate limit prevents
the checking of all of them. I confirm that all URLs in my
package are compliant with the requirements of CRAN.
the checking of all of them. I confirm that all URLs in my package are
compliant with the requirements of CRAN.

## Downstream Dependencies

I also ran R CMD check on all downstream dependencies of styler using the
revdepcheck package. The
downstream dependencies are:
I also ran R CMD check on all downstream dependencies of styler using the
revdepcheck package. The downstream dependencies are:

* Reverse imports: biocthis, boomer, exampletestr, flow, iNZightTools,
languageserver, questionr, shinymeta, shinyobjects, ShinyQuickStarter,
systemPipeShiny, tidypaleo.


* Reverse suggests: admiral, autothresholdr, crunch, datastructures, drake,
epigraphdb, ghclass, knitr, multiverse, nph, precommit, reprex, shiny.react,
shinydashboardPlus, shinyMonacoEditor, upsetjs, usethis.
- Reverse imports: biocthis, boomer, exampletestr, flow, iNZightTools,
languageserver, questionr, shinymeta, shinyobjects, ShinyQuickStarter,
systemPipeShiny, tidypaleo.

- Reverse suggests: admiral, autothresholdr, crunch, datastructures, drake,
epigraphdb, ghclass, knitr, multiverse, nph, precommit, reprex,
shiny.react, shinydashboardPlus, shinyMonacoEditor, upsetjs, usethis.

All of them finished R CMD CHECK with zero (0) ERRORS, WARNINGS and
NOTES.
All of them finished R CMD CHECK with zero (0) ERRORS, WARNINGS and NOTES.
2 changes: 2 additions & 0 deletions inst/WORDLIST
@@ -215,6 +215,7 @@ sprintf
stackoverflow
StackOverflow
startsWith
staticimports
STR
styler
stylerignore
@@ -262,6 +263,7 @@ VignetteBuilder
Visit'em
walthert
Walthert
wch
winbuilder
withr
writeLines
95 changes: 19 additions & 76 deletions vignettes/caching.Rmd
@@ -18,29 +18,13 @@ knitr::opts_chunk$set(
library(styler)
```

This is a developer vignette to explain how caching works and what we learned on
the way. To use the caching feature, please have a look at the README.
This is a developer vignette to explain how caching works and what we learned on the way. To use the caching feature, please have a look at the README.

The main caching features were implemented in the following two pull requests:

- #538: Implemented simple caching and utilities for managing caches. Input text
is styled as a whole and added to the cache afterwards. This makes most sense
given that the very same expression will probably never be passed to styler,
unless it is already compliant with the style guide. Apart from the
(negligible) innode, caching text has a memory cost of 0. Speed boosts only
result if the whole text passed to styler is compliant to the style guide in
use. Changing one line in a file with hundreds of lines means each line will
be styled again. This is a major drawback and makes the cache only useful for
a use with a pre-commit framework (the initial motivation) or when functions
like `style_pkg()` are run often and most files were not changed.

- #578: Adds a second layer of caching by caching top-level expressions
individually. This will bring speed boosts to the situation where very little
is changed but there are many top-level expressions. Hence, changing one line
in a big file will invalidate the cache for the expression the line is part
of, i.e. when changing `x <- 2` to `x = 2` below, styler will have to restyle
the function definition, but not `another(call)` and all other expressions
that were not changed.
- #538: Implemented simple caching and utilities for managing caches. Input text is styled as a whole and added to the cache afterwards. This makes most sense given that the very same expression will probably never be passed to styler, unless it is already compliant with the style guide. Apart from the (negligible) innode, caching text has a memory cost of 0. Speed boosts only result if the whole text passed to styler is compliant to the style guide in use. Changing one line in a file with hundreds of lines means each line will be styled again. This is a major drawback and makes the cache only useful for a use with a pre-commit framework (the initial motivation) or when functions like `style_pkg()` are run often and most files were not changed.

- #578: Adds a second layer of caching by caching top-level expressions individually. This will bring speed boosts to the situation where very little is changed but there are many top-level expressions. Hence, changing one line in a big file will invalidate the cache for the expression the line is part of, i.e. when changing `x <- 2` to `x = 2` below, styler will have to restyle the function definition, but not `another(call)` and all other expressions that were not changed.

```{r, eval = FALSE}
function() {
@@ -51,59 +35,18 @@ function() {
another(call)
```

While #538 also required a lot of thought, this is not necessarily visible in
the diff. The main challenge was to figure out how the caching should work
conceptually and where we best insert the functionality as well as how to make
caching work for edge cases like trailing blank lines etc. For details on the
conceptual side and requirements, see #538.

In comparison, the diff in #578 is much larger. We can walk through the main
changes introduced here:

- Each nest gained a column *is_cached* to indicate if an expression is cached.
It's only ever set for the top-level nest, but set to `NA` for all other
nests. Also, comments are not cached because they are essentially top level
terminals which are very cheap to style (also because hardly any rule concerns
them) and because each comment is a top-level expression, simply styling them
is cheaper than checking for each of them if it is in the cache.

- Each nest also gained a column *block* to denote the block to which it belongs
for styling. Running each top-level expression through
`parse_transform_serialize_r()` separately is relatively expensive. We prefer
to put multiple top-level expressions into a block and process the block. This
is done with `parse_transform_serialize_r_block()`. Note that before we
implemented this PR, all top-level expressions were sent through
`parse_transform_serialize_r()` as one block. Leaving out some exceptions in
this explanation, we always put uncached top-level expressions in a block and
cached top-level expressions into a block and then style the uncached ones.

- Apart from the actual styling, a very costly part of formatting code with
styler is to compute the nested parse data with `compute_parse_data_nested()`.
When caching top-level expressions, it is evident that building up the nested
structure for cached code is unnecessary because we don't actually style it,
but simply return `text`. For this reason, we introduce the concept of a
shallow nest. It can only occur at the top level. For the top-level
expressions we know that they are cached, we remove all children before
building up the nested parse table and let them act as `terminals` and will
later simply return their `text`. Hence, in the nested parse table, no cached
expressions have children.

- Because we now style blocks of expressions and we want to preserve the line
breaks between them, we need to keep track of all blank lines between
expressions, which was not necessary previously because all expressions were
in a block and the blank lines separating them were stored in `newlines` and
`lag_newlines` except for all blank lines before the first expression.

- Because we wanted to cache by expression, but process by block of expression,
we needed to decompose the block into individual expressions and add them to
the cache once we obtained the final text. We could probably also have added
expressions to the cache before we put the text together, but the problem is
that at some point we turn the nested structure into a flat structure and as
this must happen with a `post_visit()` approach, we'd have to implement a
complicated routine to check if we are now about to put together all top-level
expressions and then if yes write them to the cache. A simple (but maybe not
so elegant) parsing of the output as implemented in `cache_by_expression()`
seemed reasonable in terms of limiting complexity and keeping efficiency.

For more detailed explanation and documentation, please consult the help files
of the internals.
While #538 also required a lot of thought, this is not necessarily visible in the diff. The main challenge was to figure out how the caching should work conceptually and where we best insert the functionality as well as how to make caching work for edge cases like trailing blank lines etc. For details on the conceptual side and requirements, see #538.

In comparison, the diff in #578 is much larger. We can walk through the main changes introduced here:

- Each nest gained a column *is_cached* to indicate if an expression is cached. It's only ever set for the top-level nest, but set to `NA` for all other nests. Also, comments are not cached because they are essentially top level terminals which are very cheap to style (also because hardly any rule concerns them) and because each comment is a top-level expression, simply styling them is cheaper than checking for each of them if it is in the cache.

- Each nest also gained a column *block* to denote the block to which it belongs for styling. Running each top-level expression through `parse_transform_serialize_r()` separately is relatively expensive. We prefer to put multiple top-level expressions into a block and process the block. This is done with `parse_transform_serialize_r_block()`. Note that before we implemented this PR, all top-level expressions were sent through `parse_transform_serialize_r()` as one block. Leaving out some exceptions in this explanation, we always put uncached top-level expressions in a block and cached top-level expressions into a block and then style the uncached ones.

- Apart from the actual styling, a very costly part of formatting code with styler is to compute the nested parse data with `compute_parse_data_nested()`. When caching top-level expressions, it is evident that building up the nested structure for cached code is unnecessary because we don't actually style it, but simply return `text`. For this reason, we introduce the concept of a shallow nest. It can only occur at the top level. For the top-level expressions we know that they are cached, we remove all children before building up the nested parse table and let them act as `terminals` and will later simply return their `text`. Hence, in the nested parse table, no cached expressions have children.

- Because we now style blocks of expressions and we want to preserve the line breaks between them, we need to keep track of all blank lines between expressions, which was not necessary previously because all expressions were in a block and the blank lines separating them were stored in `newlines` and `lag_newlines` except for all blank lines before the first expression.

- Because we wanted to cache by expression, but process by block of expression, we needed to decompose the block into individual expressions and add them to the cache once we obtained the final text. We could probably also have added expressions to the cache before we put the text together, but the problem is that at some point we turn the nested structure into a flat structure and as this must happen with a `post_visit()` approach, we'd have to implement a complicated routine to check if we are now about to put together all top-level expressions and then if yes write them to the cache. A simple (but maybe not so elegant) parsing of the output as implemented in `cache_by_expression()` seemed reasonable in terms of limiting complexity and keeping efficiency.

For more detailed explanation and documentation, please consult the help files of the internals.
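
Reader's note (not part of the commit): the expression-level caching described in the vignette can be sketched in base R. This is a toy under stated assumptions, not styler's implementation: `top_level_exprs()`, `toy_style()`, `new_cache()`, `style_with_cache()` and `block_ids()` are hypothetical names, the "styler" only normalizes spacing around `<-`, comments between expressions are dropped, and the cache is a plain character vector rather than anything styler actually uses. It mirrors two ideas from the text: a cache hit means the expression is already styled and its text is returned unchanged, and consecutive cached or uncached expressions can be grouped into blocks.

```r
# Hypothetical sketch only; none of these functions exist in styler.

# Split source text into its top-level expressions (one string each).
top_level_exprs <- function(text) {
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  exprs <- parse(text = lines, keep.source = TRUE)
  refs <- attr(exprs, "srcref")
  vapply(
    refs,
    function(ref) paste(as.character(ref), collapse = "\n"),
    character(1)
  )
}

# Toy stand-in for real styling: one space on each side of `<-`.
toy_style <- function(expr) gsub(" *<- *", " <- ", expr)

# The cache remembers styled texts and counts how often styling ran.
new_cache <- function() {
  cache <- new.env()
  cache$styled <- character()
  cache$n_styled <- 0L
  cache
}

# Style each top-level expression, skipping those already in the cache.
style_with_cache <- function(text, cache) {
  vapply(top_level_exprs(text), function(expr) {
    if (expr %in% cache$styled) {
      return(expr) # cache hit: already compliant, return text as is
    }
    cache$n_styled <- cache$n_styled + 1L
    styled <- toy_style(expr)
    cache$styled <- union(cache$styled, styled)
    styled
  }, character(1), USE.NAMES = FALSE)
}

# Assign a block id to each run of consecutive cached / uncached expressions.
block_ids <- function(is_cached) {
  if (length(is_cached) == 0L) {
    return(integer())
  }
  cumsum(c(TRUE, diff(is_cached) != 0))
}

cache <- new_cache()
code <- "f <- function() {\n  x<-2\n}\nanother(call)"

# First pass: nothing is cached, so both expressions are styled.
first <- style_with_cache(code, cache)

# Second pass on the styled output: both expressions hit the cache.
styled_text <- paste(first, collapse = "\n")
second <- style_with_cache(styled_text, cache)

# Changing `x <- 2` to `x = 2` invalidates only the function definition.
edited <- sub("x <- 2", "x = 2", styled_text, fixed = TRUE)
third <- style_with_cache(edited, cache)

cache$n_styled # 2 after the first two passes, 3 after the edit
block_ids(c(TRUE, TRUE, FALSE, FALSE, TRUE)) # 1 1 2 2 3
```

In this sketch the edit to the function body triggers one additional styling run while `another(call)` is served from the cache, which is the behaviour the #578 paragraph describes. Real styler additionally avoids building the nested parse table for cached expressions, which this toy does not model.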