
Example of ordinal regression with ImproperDistribution #619

Closed
wants to merge 2 commits

Conversation

vanAmsterdam
Contributor

as requested here #612 (comment)

Can someone help out with updating the docs? I haven't done that before.

@fehiepsi
Member

fehiepsi commented Jun 5, 2020

Hi @vanAmsterdam, the content looks great. Thanks for contributing your example! Please don't worry about the doc format; we'll check it again.

Could you add an introductory paragraph talking a bit about ordinal regression? That would help users get an overview of the purpose of the tutorial.

Also, might I add a section on the usage of

c_y = sample("c_y", dist.TransformedDistribution(dist.Normal(0, 1).expand([nclasses]),
                                                 OrderedTransform()))

to put a weak prior on the intercepts, instead of the cutpoints c_y? I can make a separate PR or add commits to your PR, depending on which one you prefer.
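
A minimal, self-contained sketch of how such a prior could slot into an ordinal regression model (the model name, nclasses = 3, and the toy data are illustrative only, not from the PR):

import numpy as np
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.distributions.transforms import OrderedTransform
from numpyro.infer import MCMC, NUTS

nclasses = 3  # number of ordered categories (illustrative)

def model_proper_prior(X, Y=None):
    # regression coefficient for a single predictor
    b_X = numpyro.sample("b_X", dist.Normal(0, 1))
    # K ordered categories need K - 1 cutpoints; OrderedTransform keeps them
    # sorted while the Normal(0, 1) base acts as a weakly informative prior
    c_y = numpyro.sample(
        "c_y",
        dist.TransformedDistribution(
            dist.Normal(0, 1).expand([nclasses - 1]), OrderedTransform()
        ),
    )
    with numpyro.plate("obs", X.shape[0]):
        eta = X * b_X
        numpyro.sample("Y", dist.OrderedLogistic(eta, c_y), obs=Y)

# toy data, just to show the call signature
rng = np.random.default_rng(0)
X = rng.normal(size=100)
Y = rng.integers(0, nclasses, size=100)
mcmc = MCMC(NUTS(model_proper_prior), num_warmup=500, num_samples=500)
mcmc.run(random.PRNGKey(0), X, Y)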

@vanAmsterdam
Contributor Author

Thanks; yes, I'll write something.

Do you mean adding the TransformedDistribution method as model3, i.e. another way of getting the same result?
I'm not sure I follow what you mean by intercepts as opposed to cutpoints. Which intercepts do you mean?

@fehiepsi
Member

fehiepsi commented Jun 6, 2020

Sorry, I used the terminology from the Statistical Rethinking book; it seems it is not that common.

What I meant is that the models in the tutorial use improper priors for the cutpoints, whereas my last comment uses a "proper" prior for them. The motivation is:

  • Assume the predictor is 0. The cutpoints c lie in order on the log-odds scale (the y-axis of this figure), and each c_i corresponds to a cumulative probability p_i over the ordered categories (i.e. if the outputs are student scores, then p_i is the probability that the score is <= i).
  • We can put positive-support priors on the differences c_{i+1} - c_{i} of those cutpoints. In my last comment, I used LogNormal(0, 1) priors for them, which also means that log(c_{i+1} - c_{i}) (which lives in the real domain of OrderedTransform; these values are called intercepts in the above book, IIRC) has a Normal(0, 1) prior; see the sketch below.
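
As a quick numerical illustration of that relationship (a sketch for this discussion, not code from the PR): applying OrderedTransform to Normal(0, 1) draws makes each gap between consecutive cutpoints equal to exp of the corresponding unconstrained value, i.e. LogNormal(0, 1) distributed.

import jax.numpy as jnp
from jax import random
import numpyro.distributions as dist
from numpyro.distributions.transforms import OrderedTransform

nclasses = 4  # illustrative
# unconstrained draws (the "intercepts" in Statistical Rethinking's terminology)
x = dist.Normal(0, 1).sample(random.PRNGKey(0), (10_000, nclasses - 1))
# ordered cutpoints: c_0 = x_0, c_i = c_{i-1} + exp(x_i)
c = OrderedTransform()(x)

gaps = c[:, 1:] - c[:, :-1]  # c_{i+1} - c_i = exp(x_{i+1}) > 0, i.e. LogNormal(0, 1)
print(jnp.allclose(jnp.log(gaps), x[:, 1:], atol=1e-3))  # True, up to float32 rounding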

Does that make sense to you? Do you want to write some words about it in the tutorial?

@fehiepsi
Member

@vanAmsterdam Feel free to ignore my comment on the usage of TransformedDistribution. We can improve the content in a follow-up PR. :)

@vanAmsterdam
Contributor Author

vanAmsterdam commented Jun 19, 2020 via email

@fehiepsi
Member

Absolutely not! I was just worried that you found the usage of TransformedDistribution a bit counterintuitive to deal with. Please take your time! I hope you enjoy doing open-source contributions, so I don't want any sense of 'hurry' to take away that pleasure. ;)
