ARROW-14441: [R] Add our philosophy to the dev vignette #11705

Closed
thisisnic wants to merge 2 commits

@thisisnic thisisnic commented Nov 15, 2021

  • separates some of the developer docs into separate files (content remains unchanged)
  • updates the original dev docs page to point to these other pages and discuss our philosophy when implementing bindings

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@thisisnic thisisnic marked this pull request as ready for review November 17, 2021 16:04
There are a number of scripts that are triggered when the arrow R package is installed. For package users who are not interacting with the underlying code, these should all just work without configuration and pull in the most complete pieces (e.g. official binaries that we host). However, knowing about these scripts can help package developers troubleshoot if things go wrong in them or things go wrong in an install. See [the installation vignette](./install.html#how-dependencies-are-resolved) for more information.
* where necessary add extra arguments to the function signature for features
that don't exist in R but do in Arrow (e.g. passing in a schema when reading a
CSV dataset)
Member Author

I think this vignette needs more after here but I don't know exactly what. Maybe something on writing bindings between compute kernels and R functions? Or is that a bit too specific?

Member

I think an example of a compute kernel binding would be good. We could use that as an example to further explain the above points.

I'd also suggest linking to recent PRs that may serve as examples. I find a reference PR particularly helpful in reminding me which files I might need to modify.

Member Author

Great point, agreed on the reference PR. I'm also gonna tag some of the wider R dev team to pitch in on this, as I can't help feeling there's a bit more to discuss on what docs we distribute with the package vs. what docs we just have on the pkgdown site.

Member Author

@jonkeane What are your thoughts on this?

Member

Personally, I think of the pkgdown site as the canonical documentation, especially for vignettes. So I don't have a strong opinion / I'm not worried about not including it inside of / distributed alongside the package.

As for examples: I think that would be great. We can link to PRs (though if the link goes to CRAN it's susceptible to a redirect/rot that will anger CRAN — we should and do check for that, but just a reminder). Though sometimes when writing examples I find it a little bit easier to make a dedicated example that has lots of extra comments/commentary/possibly even glosses over some of the reality involved with implementing them.

Member

@wjones127 wjones127 left a comment

Thanks for organizing this! These developer docs are already quite nice and I'm glad to see them further enhanced.

Saw one broken link, and then a few other suggestions.

Comment on lines +45 to +16
* [setting up a development environment and building the components that make up the Arrow project and R package](https://arrow.apache.org/docs/r/articles/developers/setup.html)
* [common Arrow dev workflow tasks](https://arrow.apache.org/docs/r/articles/developers/workflow.html)
* [running R with the C++ debugger attached](https://arrow.apache.org/docs/r/articles/developers/debugging.html)
Member

I think these could be relative links:

Suggested change
* [setting up a development environment and building the components that make up the Arrow project and R package](https://arrow.apache.org/docs/r/articles/developers/setup.html)
* [common Arrow dev workflow tasks](https://arrow.apache.org/docs/r/articles/developers/workflow.html)
* [running R with the C++ debugger attached](https://arrow.apache.org/docs/r/articles/developers/debugging.html)
* [setting up a development environment and building the components that make up the Arrow project and R package](developers/setup.html)
* [common Arrow dev workflow tasks](developers/workflow.html)
* [running R with the C++ debugger attached](developers/debugging.html)

Member Author

Unfortunately not - when the package is built, everything in the vignettes directory is distributed with the package as an HTML document, whereas everything in any subdirectories is only displayed on the pkgdown site. This means that the relative links wouldn't work for anyone viewing this vignette locally.

Member

Ohhh. Does this mean we are making parts of these developer docs not available offline?

Member Author

Yeah, there's no real need to distribute them with the package given that most people read this content via the pkgdown site anyway.


Windows and macOS users who wish to contribute to the R package and
don't need to alter libarrow (Arrow's C++ library) may be able to obtain a
recent version of the library without building from source.
Member

While we're in here, could we fold in the spirit of https://issues.apache.org/jira/browse/ARROW-14371?

Specifically:

  • A note that the brew-based method and the build-your-own methods are incompatible (mostly because of `ARROW_HOME`, but folks using brew shouldn't even need to think about that, so we should be careful how we phrase this)
  • A note about confirming that `brew install apache-arrow --HEAD` completes successfully, and how to confirm it's being picked up in the install process
  • This might be part of the point above, but a description of what each of the following means and how they differ: `*** Using Homebrew ${PKG_BREW_NAME}`, `*** Arrow C++ libraries found via pkg-config`, and any mention of autobrew. Specifically, if one is trying to use Homebrew for development, only the first one is OK; anything else means that something isn't quite right.

If this is too much or you don't want to extend scope, that's totally fine!
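
For whoever picks this up, here is a rough sketch of the check described in the last bullet. It assumes the three messages quoted above appear verbatim in the install output; the function name `classify_arrow_source` is hypothetical and is not part of the package's tooling.

```shell
# Hypothetical helper (not part of the arrow repo): report which libarrow
# source an R package install log mentions. Reads the log on stdin.
classify_arrow_source() {
  log=$(cat)
  case "$log" in
    # The only outcome that is OK when developing against Homebrew.
    *"*** Using Homebrew"*) echo "homebrew" ;;
    # libarrow was picked up from somewhere else via pkg-config.
    *"*** Arrow C++ libraries found via pkg-config"*) echo "pkg-config" ;;
    # Any mention of autobrew means the Homebrew install was not used.
    *autobrew*) echo "autobrew" ;;
    *) echo "unknown" ;;
  esac
}
```

One possible way to use it: save the install output with something like `R CMD INSTALL . 2>&1 | tee install.log`, then run `classify_arrow_source < install.log`. Anything other than `homebrew` would suggest the Homebrew build is not the one being picked up.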

Member Author

I'll leave this for another PR, as I don't fully understand all of these things.

@thisisnic thisisnic force-pushed the ARROW-14713_split_dev branch from c807161 to 8c96be7 Compare November 19, 2021 14:41
@thisisnic
Member Author

thisisnic commented Nov 19, 2021

@jonkeane and @wjones127 - just rebased this so now the setup instructions contain the excellent changes made by @wjones127 including all that Windows content. Just thinking - I am in the process of writing up some content on writing bindings, but it's a bigger piece of work. Any objections to me doing that in a separate follow-up ticket, so we can potentially make other changes to the setup doc without having to rebase again?

Member

@wjones127 wjones127 left a comment

Yeah I think doing the example as a follow-up makes sense.

@thisisnic
Member Author

I'll cover the bindings stuff in https://issues.apache.org/jira/browse/ARROW-14757

@jonkeane
Member

@github-actions crossbow submit test-r-devdocs

@github-actions

Revision: 1eec3a0

Submitted crossbow builds: ursacomputing/crossbow @ actions-1168

Task Status
test-r-devdocs Github Actions

@jonkeane
Member

This looks good. Just to confirm, the contents of r/vignettes/developers/workflow.Rmd and r/vignettes/developers/setup.Rmd were all just copy/pasted and don't have substantive edits do they? If they do, would you mind pointing them out so we can make sure to take a look at them?

Also I presume you've run pkgdown locally to see that this all looks ok. NBD if it isn't perfect on the first merge, we've also got lots of time to look at it on the dev docs site before the 7.0.0 release

Member

@jonkeane jonkeane left a comment

LGTM, pending the crossbow job. If that passes feel free to merge

@thisisnic
Member Author

> This looks good. Just to confirm, the contents of r/vignettes/developers/workflow.Rmd and r/vignettes/developers/setup.Rmd were all just copy/pasted and don't have substantive edits do they? If they do, would you mind pointing them out so we can make sure to take a look at them?
>
> Also I presume you've run pkgdown locally to see that this all looks ok. NBD if it isn't perfect on the first merge, we've also got lots of time to look at it on the dev docs site before the 7.0.0 release

Yeah, no substantive edits and will make sure I run pkgdown to check it all before merging.

@thisisnic thisisnic closed this in e417fbf Nov 23, 2021
@ursabot

ursabot commented Nov 23, 2021

Benchmark runs are scheduled for baseline = 7a9738a and contender = e417fbf. e417fbf is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.18% ⬆️0.09%] ursa-thinkcentre-m75q
Supported benchmarks:
ursa-i9-9960x: langs = Python, R, JavaScript
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
