[FEAT] Run multiple u training rounds to check stability #1179
Summarising this issue as "what if we're (unwittingly) estimating bad u values?", my thoughts are:
Example pandas code to show the RSE error for u:

```python
from dataengineeringutils3.s3 import read_json_from_s3
import pandas as pd
import numpy as np

pd.options.display.max_rows = 100
pd.set_eng_float_format(accuracy=3, use_eng_prefix=True)

path = "s3://alpha-data-linking/v4/model_training/person/dev/probation_delius/basic/2023-03-06/combined_model/settings.json"
model = read_json_from_s3(path)

sample_size = 3e8

# Pull the trained u probabilities out of each comparison level
u = {
    c["output_column_name"]: [
        p["u_probability"] for p in c["comparison_levels"] if "u_probability" in p
    ]
    for c in model["comparisons"]
}

df = pd.DataFrame.from_dict(u, orient="index").reset_index()
df.columns = ["col", "u0", "u1", "u2", "u3", "u4", "u5", "u6"]
df = pd.wide_to_long(df, stubnames="u", i="col", j="level").reset_index()
df = df.dropna().sort_values(["col", "level"]).reset_index(drop=True)

# Percentage relative standard error of each u estimate
df["rse"] = np.sqrt((df.u * (1 - df.u)) / sample_size) / df.u * 100
df.style.background_gradient(axis=0, subset=["u", "rse"]).format("{:.1f}", subset="rse")
```
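For intuition, the `rse` column is just the binomial relative standard error, `100 * sqrt(u * (1 - u) / n) / u`. A quick check at an illustrative rare level (the `1e-6` value here is made up, not taken from the model above):

```python
import numpy as np

u = 1e-6   # illustrative u probability for a rare comparison level
n = 3e8    # the sample size used above

rse = np.sqrt(u * (1 - u) / n) / u * 100
print(f"{rse:.1f}%")  # ~5.8% relative standard error, even with 300M sampled pairs
```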
See also #1060 - in particular, by considering the cardinality and skew of columns, you could probably estimate the max rows needed to ensure a stable estimate of u values, rather than needing to iterate. A sketch of this idea is below.
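Inverting the RSE formula above gives a rough estimate of the sample size needed to pin down the rarest u value you care about, along the lines suggested in #1060 (`pairs_needed`, `min_u` and `target_rse_pct` are illustrative names, not an existing API):

```python
def pairs_needed(min_u: float, target_rse_pct: float) -> float:
    """Solve RSE = 100 * sqrt(u * (1 - u) / n) / u for n."""
    return min_u * (1 - min_u) / (min_u * target_rse_pct / 100) ** 2

# e.g. to estimate a level with u ~ 1e-6 to within 5% RSE:
print(f"{pairs_needed(1e-6, 5):.2e}")  # ~4.00e+08 pairs
```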
@samnlindsay Note you can train u probabilities using EM with this option (Line 1136 in 021813b).
Is your proposal related to a problem?
As discussed in a separate PR adding a seed to u sampling.

If users train u with too small a sample, the only way they can tell is by noticing that some comparison levels come back with no u value at all (because the sample contained no pairs at those levels). In some cases, they could get lucky and train all u values, so even that signal would be absent (but the problem would crop up in later runs when the sample missed certain comparison levels).
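To make this concrete: if a level has true u = 1e-7, the chance that a random sample of n pairs contains no pair at that level is roughly exp(-n * u), so missing it entirely is plausible even at millions of pairs (illustrative numbers):

```python
import numpy as np

u = 1e-7  # true u for a rare comparison level (illustrative)
for n in [1e6, 1e7, 1e8]:
    p_unseen = np.exp(-n * u)  # P(level absent from sample) ~ (1 - u)**n
    print(f"n={n:.0e}: P(level unseen) = {p_unseen:.2f}")
```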
Describe the solution you'd like
It would be helpful to add the ability for estimate_u_using_random_sampling to do multiple runs and compare the u values generated; it may even make sense to make multiple runs the default. The per-run estimates could then be passed into parameter_estimate_comparisons_chart (a sketch of what this could look like is below).
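A minimal sketch of the multiple-run idea, assuming each run yields a settings dict shaped like the one in the S3 example above; `train_u` is a hypothetical stand-in for a single estimate_u_using_random_sampling call with a given seed, not an existing function:

```python
import pandas as pd

# Hypothetical stand-in for one estimate_u_using_random_sampling run;
# assumed to return the trained settings dict (same shape as above).
def train_u(seed: int) -> dict:
    ...

rows = []
for seed in range(5):
    model = train_u(seed)
    for c in model["comparisons"]:
        u_levels = [
            p["u_probability"] for p in c["comparison_levels"] if "u_probability" in p
        ]
        for level, u_val in enumerate(u_levels):
            rows.append(
                {"seed": seed, "col": c["output_column_name"], "level": level, "u": u_val}
            )

df = pd.DataFrame(rows)

# Spread across seeds flags unstable estimates; the mean would be the
# final u carried forward into m training.
summary = df.groupby(["col", "level"])["u"].agg(["mean", "std"])
summary["cv_pct"] = summary["std"] / summary["mean"] * 100
print(summary.sort_values("cv_pct", ascending=False))
```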
Given that the u values are used to generate m, an average over the runs would have to be taken before estimating m. It would also be useful to show this final averaged u value in parameter_estimate_comparisons_chart. If the final u value is included in parameter_estimate_comparisons_chart, it would also be useful to show the final m value as well as the individual training sessions (or at least have a parameter allowing it).

Describe alternatives you've considered
Additional context