Feat: Follow up on source calls #605

Closed
10 of 11 tasks
EagleoutIce opened this issue Jan 18, 2024 · 0 comments · Fixed by #609
Labels
dataflow Related to dataflow extraction enhancement New feature or request

Comments

@EagleoutIce
Member

EagleoutIce commented Jan 18, 2024

In R, users can include other files with the help of the source function like this:

# some amazing code
# source for the current environment
source("another/file/another/bit/of/fun.R")

# some amazing code which can refer to sourced variables

We want to support this in flowR. For this, we need to:

  • detect calls to base::source (i.e., check that source has not been redefined).
  • if found, follow the source (currently, only if the argument is given as a constant).
    • this could be checked whenever the dataflow processor is checking a named function call
  • check if the file exists; if it does not, warn, otherwise retrieve the file
  • run parse and normalize on the file (in other words, the steps required before DF analysis)
  • run the dataflow analysis on the normalized, sourced file using the current environment
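
The detection and lookup steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CallNode`, `Environment`, `FileProvider`, and `resolveSourceCall` are hypothetical names and not part of flowR's code base, and the parse, normalize, and dataflow steps are left to the caller.

```typescript
// Hypothetical types for illustration only; none of these names come from flowR.
interface CallNode {
  functionName: string;
  // Arguments reduced to "a constant string" or "anything else".
  args: Array<{ kind: 'string-constant'; value: string } | { kind: 'other' }>;
}

interface Environment {
  // Names that the analyzed program has (re)defined so far.
  definedNames: Set<string>;
}

interface FileProvider {
  exists(path: string): boolean;
  read(path: string): string;
}

type SourceResolution =
  | { type: 'followed'; path: string; content: string }
  | { type: 'skipped'; reason: string };

// Decides whether a named function call is a followable call to base::source.
function resolveSourceCall(call: CallNode, env: Environment, files: FileProvider): SourceResolution {
  if (call.functionName !== 'source') {
    return { type: 'skipped', reason: 'not a call to source' };
  }
  // If the program redefined `source`, this is no longer base::source.
  if (env.definedNames.has('source')) {
    return { type: 'skipped', reason: 'source has been redefined' };
  }
  const first = call.args[0];
  if (first === undefined || first.kind !== 'string-constant') {
    return { type: 'skipped', reason: 'argument is not a constant' };
  }
  if (!files.exists(first.value)) {
    // A real implementation would emit a warning here.
    return { type: 'skipped', reason: `file not found: ${first.value}` };
  }
  // The caller would now parse, normalize, and analyze `content`
  // using the current environment.
  return { type: 'followed', path: first.value, content: files.read(first.value) };
}
```

Returning a tagged result instead of throwing keeps the "warn and continue" behavior in the hands of the caller.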

It should:

  • work recursively (a.R sources b.R, which sources c.R, which may source a.R again).
    It is fine if we use a hard cut-off to deal with recursion, or reuse existing dataflow graphs to
    integrate until we reach a fixpoint.
  • work within control flow constructs (i.e., if it happens within an if)
  • be general enough to be theoretically testable independent from the complete flowR pipeline
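
The hard cut-off variant of the recursion requirement might look like the sketch below. It assumes a hypothetical callback `sourcesOf` that returns the constant paths a file passes to source; `followSources` and its result shape are likewise made up for illustration. The fixpoint alternative is not shown.

```typescript
// Hypothetical callback: the constant paths that `path` passes to source().
type SourcesOf = (path: string) => string[];

interface FollowResult {
  // Files in the order in which they would be analyzed.
  analyzed: string[];
  // True if some chain of source calls exceeded the depth limit.
  cutOff: boolean;
}

function followSources(entry: string, sourcesOf: SourcesOf, maxDepth: number): FollowResult {
  const analyzed: string[] = [];
  // Files on the current chain of source calls, used to detect cycles.
  const active = new Set<string>();
  let cutOff = false;

  const visit = (path: string, depth: number): void => {
    if (active.has(path)) {
      return; // cycle, e.g. c.R sourcing a.R while a.R is still being analyzed
    }
    if (depth > maxDepth) {
      cutOff = true; // hard cut-off for very deep chains
      return;
    }
    active.add(path);
    analyzed.push(path);
    for (const next of sourcesOf(path)) {
      visit(next, depth + 1);
    }
    active.delete(path);
  };

  visit(entry, 0);
  return { analyzed, cutOff };
}
```

Because `sourcesOf` is passed in, the traversal can be exercised with a plain object describing which file sources which, without running the rest of the pipeline.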

It would be nice if the following were supported as well (or could be added later):

  • Configuration: when starting the dataflow analysis, one can configure flowR to ignore source calls and get the original behavior (tracked in "Add a startup configuration file for flowr" #642).
  • Caching: graphs can be cached on a per-file basis in case one script sources another one
    multiple times.
  • Testing: the "file provider abstraction" can be mocked so that we can use source in tests without actually referring to files on the host file system.
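
A mockable file provider combined with a per-file cache could be sketched like this. All names (`SourceFileProvider`, `InMemoryFileProvider`, `cachedPerFile`) are hypothetical, and the cache deliberately ignores that the result for a sourced file may depend on the environment at the call site, which a real implementation would have to account for.

```typescript
// Hypothetical abstraction over file access so that tests never touch the host file system.
interface SourceFileProvider {
  exists(path: string): boolean;
  read(path: string): string;
}

// Test double backed by a map from path to file content.
class InMemoryFileProvider implements SourceFileProvider {
  constructor(private readonly files: Map<string, string>) {}

  exists(path: string): boolean {
    return this.files.has(path);
  }

  read(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) {
      throw new Error(`no such file: ${path}`);
    }
    return content;
  }
}

// Wraps an analysis function so that each path is analyzed at most once.
function cachedPerFile<T>(
  analyze: (path: string, content: string) => T,
  provider: SourceFileProvider
): (path: string) => T {
  const cache = new Map<string, T>();
  return (path: string): T => {
    if (cache.has(path)) {
      return cache.get(path) as T;
    }
    const result = analyze(path, provider.read(path));
    cache.set(path, result);
    return result;
  };
}
```

In a test, the provider would be filled with the scripts under test, for example `new InMemoryFileProvider(new Map([['lib.R', 'x <- 1']]))`, and handed to whatever component follows source calls.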

This is to be done on the main branch and independently of v2 (although the plan is to port it later).

@EagleoutIce EagleoutIce added enhancement New feature or request dataflow Related to dataflow extraction labels Jan 18, 2024
@Ellpeck Ellpeck self-assigned this Jan 18, 2024
@Ellpeck Ellpeck linked a pull request Jan 22, 2024 that will close this issue