ARROW-14441: [R] Add our philosophy to the dev vignette #11705
thisisnic commented Nov 15, 2021 (edited)
- separates some of the developer docs into separate files (content remains unchanged)
- updates the original dev docs page to point to these other pages and discuss our philosophy when implementing bindings
There are a number of scripts that are triggered when the arrow R package is installed. For package users who are not interacting with the underlying code, these should all just work without configuration and pull in the most complete pieces (e.g. official binaries that we host). However, knowing about these scripts can help package developers troubleshoot if things go wrong in them or things go wrong in an install. See [the installation vignette](./install.html#how-dependencies-are-resolved) for more information.
* where necessary add extra arguments to the function signature for features
that don't exist in R but do in Arrow (e.g. passing in a schema when reading a
CSV dataset)
I think this vignette needs more after here but I don't know exactly what. Maybe something on writing bindings between compute kernels and R functions? Or is that a bit too specific?
I think an example of a compute kernel binding would be good. We could use that as an example to further explain the above points.
I'd also suggest linking to recent PRs that may serve as examples. I find a reference PR particularly helpful in reminding me which files I might need to modify.
Great point, agreed on the reference PR. I'm also gonna tag some of the wider R dev team to pitch in on this, as I can't help but feel there's a bit more to discuss on what docs we distribute with the package vs. what docs we just have on the pkgdown site.
@jonkeane What are your thoughts on this?
Personally, I think of the pkgdown site as the canonical documentation, especially for vignettes. So I don't have a strong opinion, and I'm not worried about this content not being included in or distributed alongside the package.
As for examples: I think that would be great. We can link to PRs (though if the link goes to CRAN it's susceptible to a redirect/rot that will anger CRAN — we should and do check for that, but just a reminder). Though sometimes when writing examples I find it a little bit easier to make a dedicated example that has lots of extra comments/commentary/possibly even glosses over some of the reality involved with implementing them.
Thanks for organizing this! These developer docs are already quite nice and I'm glad to see them further enhanced.
Saw one broken link, and then a few other suggestions.
* [setting up a development environment and building the components that make up the Arrow project and R package](https://arrow.apache.org/docs/r/articles/developers/setup.html)
* [common Arrow dev workflow tasks](https://arrow.apache.org/docs/r/articles/developers/workflow.html)
* [running R with the C++ debugger attached](https://arrow.apache.org/docs/r/articles/developers/debugging.html)
I think these could be relative links:
* [setting up a development environment and building the components that make up the Arrow project and R package](developers/setup.html)
* [common Arrow dev workflow tasks](developers/workflow.html)
* [running R with the C++ debugger attached](developers/debugging.html)
Unfortunately not. When the package is built, everything in the vignettes directory is distributed with the package as an HTML document, whereas everything in any subdirectories is only displayed on the pkgdown site. This means that the relative links wouldn't work for anyone viewing this vignette locally.
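To illustrate why the relative links would break locally, here is a minimal Python sketch using the standard library's `urllib.parse.urljoin`. The page name `developing.html` and the local library path are assumptions made up for illustration; they are not taken from this PR.

```python
from urllib.parse import urljoin

# Relative link as proposed in the suggestion above.
RELATIVE_LINK = "developers/setup.html"

# Hypothetical locations of the top-level vignette (assumed names and paths).
PKGDOWN_PAGE = "https://arrow.apache.org/docs/r/articles/developing.html"
LOCAL_PAGE = "file:///usr/lib/R/library/arrow/doc/developing.html"


def resolve(base: str, link: str) -> str:
    """Resolve a link against the page it appears on, as a browser would."""
    return urljoin(base, link)


on_site = resolve(PKGDOWN_PAGE, RELATIVE_LINK)
local = resolve(LOCAL_PAGE, RELATIVE_LINK)

print(on_site)  # target when the vignette is read on the pkgdown site
print(local)    # target when the installed vignette is opened locally
```

On the pkgdown site the relative link resolves to the same URL as the absolute link in the diff. Locally it points into a `developers/` subdirectory next to the installed vignette, which, per the comment above, is not distributed with the package.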
Ohhh. Does this mean we are making parts of these developer docs not available offline?
Yeah, there's no real need to distribute them with the package given that most people read this content via the pkgdown site anyway.
Windows and macOS users who wish to contribute to the R package and
don't need to alter libarrow (Arrow's C++ library) may be able to obtain a
recent version of the library without building from source.
While we're in here, could we fold in the spirit of https://issues.apache.org/jira/browse/ARROW-14371?
Specifically:
- A note that the brew-based method and the build-your-own methods are incompatible (mostly because of ARROW_HOME, but folks using brew shouldn't need to even think about that, so we should be careful how we phrase this)
- A note about confirming that `brew install apache-arrow --HEAD` completes successfully and how to confirm it's being picked up in the install process
- This might be part of the point above, but a description of the difference between, and meaning of, the following: `*** Using Homebrew ${PKG_BREW_NAME}`, `*** Arrow C++ libraries found via pkg-config`, and any mention of autobrew. Specifically, if one is trying to use Homebrew for development, only the first one is OK; if someone sees something else, that means that something isn't quite right.
If this is too much or you don't want to extend scope, that's totally fine!
I'll leave this for another PR, as I don't fully understand all of these things.
Force-pushed from c807161 to 8c96be7
@jonkeane and @wjones127 - just rebased this so now the setup instructions contain the excellent changes made by @wjones127 including all that Windows content. Just thinking - I am in the process of writing up some content on writing bindings, but it's a bigger piece of work. Any objections to me doing that in a separate follow-up ticket, so we can potentially make other changes to the setup doc without having to rebase again?
Yeah I think doing the example as a follow-up makes sense.
I'll cover the bindings stuff in https://issues.apache.org/jira/browse/ARROW-14757
@github-actions crossbow submit test-r-devdocs
Revision: 1eec3a0
Submitted crossbow builds: ursacomputing/crossbow @ actions-1168
This looks good. Just to confirm, the contents of `r/vignettes/developers/workflow.Rmd` and `r/vignettes/developers/setup.Rmd` were all just copy/pasted and don't have substantive edits, do they? If they do, would you mind pointing them out so we can make sure to take a look at them? Also, I presume you've run pkgdown locally to see that this all looks OK. NBD if it isn't perfect on the first merge; we've also got lots of time to look at it on the dev docs site before the 7.0.0 release.
LGTM, pending the crossbow job. If that passes feel free to merge
Yeah, no substantive edits and will make sure I run pkgdown to check it all before merging.
Benchmark runs are scheduled for baseline = 7a9738a and contender = e417fbf. e417fbf is a master commit associated with this PR. Results will be available as each benchmark for each run completes.