-
-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the new DDL (ddl class) #926
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs and comments require some rephrasing.
R/ddl.R
Outdated
#' evaluation and code masking process it is recommended to use `ddl_run`. | ||
#' Package provides universal `username_password_server` which | ||
#' runs [ddl_run] function, which returns `teal_data` object. | ||
#' Details in the the example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#' Details in the the example | |
#' Details in the the example. |
R/ddl.R
Outdated
if (!missing(expr) && !missing(code)) { | ||
stop("Only one of `expr` or `code` should be specified") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The advantage of this is that we will be having consistent error message that end-users might be already familiar with.
(please implement in other places if necessary)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not particularly familiar with rlang
error messages myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rlang error system ftw!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the main features of rlang@1.0.0
is better display of the error messages - link. Since the release of this version, most (if not all?) of tidyverse packages followed this. That's why I do believe that this might be more coherent across all the packages one might use hence: easier to understand and take appropriate action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had issues in the past year with rlang
handling errors
. Needed to use rlang::cnd_message_lines(error)
on Windows on every error
message in tryCatch
to get an accurate error
message. Otherwise errors were always empty. If people have Windows and old rlang
versions they may not see an error at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also vote up for striving away from tidyverse
as they have a great history of breaking changes and tend to have multiple dependencies in it. I don't think general error messages in rlang
are better than customized, personalized human-crafted ones that we thought through to inform the user in the most useful manner as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do echo similar mixed feeling about tidyverse but this is actually about r-lib which I consider much more solid.
I don't think general error messages in rlang are better than customized, personalized human-crafted ones that we thought through to inform the user in the most useful manner as possible.
Yes and no. Yes because the role of the error message is to clearly indicate what's wrong and why to allow a quick fix by the end users. No, because if we will be too verbose then it won't be clear anymore. The other thing that I am actually trying to address here is consistency. Just a few examples of our codebase that I managed to find within few minutes:
https://github.com/insightsengineering/teal.slice/blob/686f1f76ac6130325df9314fea419800c139f35a/R/teal_slice.R#L134-L135
https://github.com/insightsengineering/teal.modules.general/blob/eef36c1dda918fcbebe948d8bef189bd7e545974/R/tm_g_bivariate.R#L762-L764
https://github.com/insightsengineering/teal.data/blob/c04b7edef44d80fd4cdf0c6196d5c9c3de7c84b1/R/utils.R#L21-L24
https://github.com/insightsengineering/teal.slice/blob/686f1f76ac6130325df9314fea419800c139f35a/R/teal_slice.R#L134-L135
If we can delegate the check as well as error message for a separate functionality called from all of those places then we would achieve some consistency. It's mostly done via checkmate
for individual objects checks. I hope that rlang
is a good candidate to fill in the gaps of what cannot be done via checkmate
.
One might raise that we don't want to introduce a new dependency but rlang
is very likely to be already there (directly or indirectly). What's great about it is that this is a zero-dependency package. That changed my view on this package quite significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those 4 examples are actually fine in my opinion. Very clear :)
If rlang has zero dependencies than I'm less against, but still it's better to have a package that has 8 dependencies, than a packages with 15 dependencies where 7 of them has zero dependencies. The overview of such package become repellent. Each package extends the time of the installation. And each package require us and some more advanced users who would like to read the code to understand another workflow. How many workflows can we use in a teal designed to be simple and seamless for R users?
I would be okay with such an implementation as long as it's clear that the ddl module is a "special" module. Ideally, I would like fewer changes to create a |
I agree.
I disagree. |
Some more details for clarity's sake: An important change would be that the data could be supplied to the application either by passing (*) the |
Thanks @chlebowa for the explanation in your last comment! |
And I guess |
No. |
If |
Because it's difficult to arrange masking at the very end of the analysis as opposed to upon data creation.
|
I think it depends on where/when you pass masking parameters, instead passing to to |
Oh, even better. Which module? You never want to show unmasked code so would you rather mask it once, upon creation, or multiple times, in analytical modules? |
No, it's gonna be masked only in |
The code can still be added to the report, would it not be masked then? |
Alrighty @chlebowa so on |
I don't see how this is in any wat better than masking in the |
You postpone the masking part until it's really necessary. If a person does not create a report or does not check the Shown R Code, then the substitution action is not triggered at all. |
Conversely, if you do want it, you have to do it twice. |
Ok forget it. You can just not include it in DDL lol. |
# Replace input$ with .() | ||
code_strings <- gsub("input\\$(\\w+\\.?\\w*)", "\\.(\\1)", code_strings) | ||
code_strings <- gsub("(input\\$)(`[^`]+`)", "\\.(\\2)", code_strings) | ||
|
||
# Replace input[[ with .() | ||
code_strings <- gsub("(input\\[\\[\")(\\w+\\.?\\w*)(\"\\]\\])", "\\.(\\2\\)", code_strings) | ||
code_strings <- gsub("(input\\[\\[\")(\\w+\\-\\w+)\"\\]\\]", ".(`\\2`)", code_strings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Replace input$ with .() | |
code_strings <- gsub("input\\$(\\w+\\.?\\w*)", "\\.(\\1)", code_strings) | |
code_strings <- gsub("(input\\$)(`[^`]+`)", "\\.(\\2)", code_strings) | |
# Replace input[[ with .() | |
code_strings <- gsub("(input\\[\\[\")(\\w+\\.?\\w*)(\"\\]\\])", "\\.(\\2\\)", code_strings) | |
code_strings <- gsub("(input\\[\\[\")(\\w+\\-\\w+)\"\\]\\]", ".(`\\2`)", code_strings) | |
code_strings <- gsub("(input\\$)([a-z0-9_.`-]+)", "\\.(\\2\\)", code_strings) | |
code_strings <- gsub("(input\\[\\[\")([a-z0-9_.`-]+)(\"\\]\\])", "\\.(\\2\\)", code_strings) | |
code_strings <- gsub("\\.\\(([^)`]*?)-([^)`]*?)\\)", ".(`\\1-\\2`)", code_strings) |
input[["var-name"]]
are converted .(var_name)
, addtional regaular expression add backticks around variable names to to ensure they are valid for substitution in R.
@@ -114,11 +114,10 @@ init <- function(data, | |||
footer = tags$p(), | |||
id = character(0)) { | |||
logger::log_trace("init initializing teal app with: data ({ class(data)[1] }).") | |||
|
|||
if (!inherits(data, c("TealData", "teal_data"))) { | |||
if (!inherits(data, c("TealData", "teal_data", "ddl"))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update to roxygen argument description should follow this change
I like the masking option, and the combination of "ddl/ddl_run" makes sense to me, so I prefer 1.1. Even though 1.2 and 2.2 are simpler, I think we should encourage the use of all DDL arguments correctly to effectively conceal the pre-processing aspect of DDL. I am okay to create vignette of the simpler way to cater our more advance users.
This is interesting. Instead of using a splash screen, we will present the DDL as a module as a new tab. How does this affect other modules though? Since ddl() creates teal_data which feeds to modules, if we serve ddl module as a new tab together with the rest of modules, how do we ensure that this module will be process first? I suppose we could implement a similar approach with the landing popup. We could assess the class of the module, and if it's a DDL module, we can temporarily block or disable the other modules. After the submit button is clicked and authentication is successful, we can then re-enable the other modules. |
I had the very same intuition and I follow your idea as well. Could be handled very similarly to the way we handle |
I think we have all agree that this class is less favourable and |
less favourable or more favourable ;)? |
Closing this to focus on https://github.com/insightsengineering/teal/pull/957/files |
Part of #937
Closes #895
install
Installation code
ddl class
unfold
Small class which contains
ui
andserver
.server
callsrun_ddl
which evaluates acode
using input and createteal_data
object. Functionddl
accepts several arguments which are used to generate properui
andserver
.Above diagram depict how class works. Please follow steps below and look at the diagram.
ddl
arguments to fit someone's needs. Take for examplecustom_connector
. Let's assume that we need to callscda::synthetic_cdisc_data('latest')
for two datasets.Imagine that above datasets are on the remote server so we need to call
open_conn
function and close after pull. Imagine thatopen_conn
andclose_conn
are in some external package.username
andpassword
should be specified by the app user. To do so we need to createui
which will createusername
andpassword
inputs which are used during evaluation of above code.If we have
ui
we need alsoserver
to utilize input. When submit button is clicked allinput
values are passed toddl_run
where all magic happens.ddl_run
executes the code and substitutes allinput$<name>
with equivalentinput_list
element.server
function needs to have...
in the formals where all necessary objects are passed through.Because user
username
andpassword
is not something we would like to include in the show R code. We need to replace them with some other code which can be run by anyone who receives the code. To replace args with some desired value or call we useoffline_args
to substitute code with these.At this point
ddl_run
evaluated "online" code, and has environment with the results of the evaluation. Also "offline" code is available after substitution. At the endteal_data
is constructed from data and offline codeddl_run
require onlyinput_list
. It is a simplification for potential app developer who doesn't need to handle additional arguments (code
,input_mask
) in theserver
. Consequence of this simplification is thatserver
specified inddl()
have an enclosing environment changed to the environment whereddl_run
,code
,input_mask
is available. Another consequence is thatddl_run
theoretically doesn't exist anywhere - it is created on the fly whenddl
object is created. It means that developing ownddl
connector in external "pkg" can't use private functions from that "pkg".Example teal app with ddl
teal app code
use
teal.data
,teal
andteal.slice
fromrefactor
branchInternal Connectors
insightsengineering/nestdevs-tasks#38