Post-processing write-up #27

topepo · 2023-05-12T18:46:49Z

No description provided.

DavisVaughan · 2023-05-15T13:05:05Z

post-processing/readme.qmd

+
+wflow_individ <- 
+  wflow_2 %>% 
+  add_prob_calibration(cal_object) %>% 


Maybe add a note about where cal_object comes from
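
For readers wondering the same thing, here is a minimal sketch of one way a calibration object could be produced with the probably package. It is not taken from the write-up: the use of the `segment_logistic` example data and of `cal_estimate_logistic()` are assumptions made for illustration, and in practice the calibrator would presumably be estimated on held-out predictions rather than on the data shown here.

```r
library(probably)

# `segment_logistic` ships with probably and holds class probabilities
# (.pred_poor, .pred_good) alongside the true outcome (Class).
# Assumption: a logistic calibrator stands in for the write-up's `cal_object`.
cal_object <- cal_estimate_logistic(segment_logistic, truth = Class)

# Applying the calibrator returns the same rows with adjusted probabilities.
calibrated <- cal_apply(segment_logistic, cal_object)
```

An object like this is what the `add_prob_calibration(cal_object)` call in the diff above would presumably receive.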

DavisVaughan · 2023-05-15T13:06:44Z

post-processing/readme.qmd

+# 'object' is a pre-made calibration object
+# Data Inputs: class probabilities
+# Data Outputs: class probabilities and recomputed class predictions
+add_prob_calibration(object, .priority = 1.0)


All of these also have x as a first argument, where x is the workflow.

DavisVaughan · 2023-05-15T13:09:04Z

post-processing/readme.qmd

+# Potentially tunable
+# Data Inputs: class probabilities
+# Data Outputs: class predictions
+add_prob_threshold(threshold = numeric(), .priority = 2.0)


Might need a levels argument and an ordered argument? Like probably::make_two_class_pred()
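
As a point of reference, a small sketch of how `probably::make_two_class_pred()` uses `levels` and `threshold` today. The probabilities and level names are made up for illustration; the call only shows the existing probably interface, not the proposed `add_prob_threshold()`.

```r
library(probably)

# Hypothetical probabilities for the first level ("good").
prob_good <- c(0.10, 0.45, 0.55, 0.90)

# `levels` supplies the two class labels; `estimate` is the probability
# of the first one. `ordered = FALSE` is the default and is spelled out here
# only because the comment above asks about it.
hard_pred <- make_two_class_pred(
  estimate = prob_good,
  levels = c("good", "poor"),
  threshold = 0.5,
  ordered = FALSE
)
```

Without `levels`, a thresholding step would have no way of knowing which labels to attach to the hard predictions.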

DavisVaughan · 2023-05-15T13:12:12Z

post-processing/readme.qmd

+# Potentially tunable
+# Data Inputs: class probabilities
+# Data Outputs: class predictions
+add_cls_eq_zone(value = numeric(), threshold = numeric(), .priority = 3.0)


In the probably package, probably::make_class_pred() had a buffer argument that created a range of [threshold - buffer[1], threshold + buffer[2]], where anything inside the buffer range was marked equivocal. Maybe you could use the buffer arg here?

Maybe also name it something similar to add_prob_threshold() like add_prob_threshold_buffered() where:

add_prob_threshold() always returns a factor (maybe ordered)

add_prob_threshold_buffered() always returns a <class_pred> from probably
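
A minimal sketch of the buffer behavior, using the `buffer` argument of `probably::make_two_class_pred()` as it exists in the released package. The probabilities are made up; `add_prob_threshold_buffered()` itself is only a proposed name and is not called here.

```r
library(probably)

# Hypothetical probabilities for the first level ("good").
prob_good <- c(0.10, 0.45, 0.55, 0.90)

# A single buffer value is applied on both sides of the threshold,
# so probabilities in [0.4, 0.6] are marked equivocal.
buffered <- make_two_class_pred(
  estimate = prob_good,
  levels = c("good", "poor"),
  threshold = 0.5,
  buffer = 0.1
)

# The two middle values fall inside the buffer range.
is_equivocal(buffered)
```

A length-two `buffer` gives an asymmetric zone, which matches the `[threshold - buffer[1], threshold + buffer[2]]` description above.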

DavisVaughan · 2023-05-15T13:16:18Z

post-processing/readme.qmd

+# User will have to always set the priority
+# Data Inputs: all predictions
+# Data Outputs: all predictions (only these columns are retained)
+add_post_mutate(..., .priority)


I would consider giving all of these functions a common prefix that differentiates them from the other add_*() functions, like add_post_*():

add_post_calibration() # do you need prob vs reg calibration? can you just "figure it out"? can it be an argument?
add_post_threshold()
add_post_threshold_buffered()
add_post_mutate()

Conflicted on this. I agree this would be nice for tab completion, but it is inconsistent with the naming convention for preprocessors: add_variables(), add_recipe(), add_formula().

DavisVaughan · 2023-05-15T13:20:52Z

post-processing/readme.qmd

+ - A container for the list of possible post-processors specified by the user.
+ - A validation system to resolve conflicts in type or priority. 
+ - An interface to apply the operations to the predicted values. 
+ - The requisite package dependencies (primarily the probably package)


I am mildly worried about the number of Imports that the dev version of probably has. It is fairly high, and it might be worth going back to see if some of them can be moved to Suggests as optional deps.

DavisVaughan · 2023-05-15T13:23:00Z

post-processing/readme.qmd

+
+ - `new_stage_post()`, and `new_action_post()` are existing constructors. 
+
+We will require a `.fit_post(workflow, data)` that will execute _only_ the post-processing operations; the `workflows` object will already have trained the `pre` and `fit` stages. 


Since this is really an "internal" function, I think we can just assume that the workflow has already trained the pre and fit stages, i.e. we don't need to do any checks to see whether that is true. It should only be used internally by workflows and by tune.
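
To make the idea concrete, here is a toy base-R sketch of a post-processing step that only applies post-processors, in priority order, to predictions that already exist. Every name in it (`new_action()`, `fit_post_sketch()`, the column names) is hypothetical and chosen for illustration; it is not the workflows implementation and performs no checks on earlier stages, in line with the suggestion above.

```r
# Hypothetical container: an action is a function of the predictions
# plus a numeric priority (lower runs first).
new_action <- function(fn, priority) {
  list(fn = fn, priority = priority)
}

# Hypothetical stand-in for `.fit_post()`: it assumes predictions from an
# already-trained model are supplied and only runs the post-processors.
fit_post_sketch <- function(actions, predictions) {
  priorities <- vapply(actions, function(a) a$priority, numeric(1))
  for (action in actions[order(priorities)]) {
    predictions <- action$fn(predictions)
  }
  predictions
}

# Clamp probabilities first (priority 1), then threshold (priority 2),
# even though the actions were registered in the opposite order.
threshold_action <- new_action(function(p) {
  p$.pred_class <- ifelse(p$.pred_yes >= 0.5, "yes", "no")
  p
}, priority = 2)

clamp_action <- new_action(function(p) {
  p$.pred_yes <- pmin(pmax(p$.pred_yes, 0.05), 0.95)
  p
}, priority = 1)

preds <- data.frame(.pred_yes = c(0.01, 0.70))
out <- fit_post_sketch(list(threshold_action, clamp_action), preds)
```

Sorting by priority inside the function is one possible way to handle the ordering the write-up describes; conflict resolution between equal priorities is left out of this sketch.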

simonpcouch · 2023-05-16T16:49:49Z

Looking at this now; will merge some of Davis' suggestions and push some small edits. Will leave a review when finished!

Co-authored-by: Davis Vaughan <davis@rstudio.com>

simonpcouch

Few big-picture edits from me beyond Davis' comments!

simonpcouch · 2023-05-16T17:03:52Z

post-processing/readme.qmd

+ - An interface to apply the operations to the predicted values. 
+ - The requisite package dependencies (primarily the probably package).
+
+The dwai code may eventually make its way into workflows or probably. 


If this is our plan, I might argue we put this functionality in workflows or probably from the get-go. It feels a bit like this living in its own package could eventually become technical debt.

topepo added 2 commits May 11, 2023 14:50

initial thoughts

c036f8e

render docs

a00afd6

topepo requested review from DavisVaughan and simonpcouch May 12, 2023 18:47

DavisVaughan reviewed May 15, 2023

simonpcouch and others added 2 commits May 16, 2023 12:50

apply suggestions from davis' code review

5918012

Co-authored-by: Davis Vaughan <davis@rstudio.com>

small copy edits

f5c00c1

simonpcouch approved these changes May 16, 2023
