Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Probability of Direction (pd) #428

Closed

Conversation

DominiqueMakowski
Copy link

Indices of effect significance/existence computed from posterior distributions are very common in some fields. This feature adds the probability of direction, p_direction(), a Bayesian equivalent of the p-value.

I followed the implementation of mean() by using the summarize() workflow.

@devmotion
Copy link
Member

Thank you for the PR! I'd suggest, however, opening a PR to MCMCDiagnosticTools (first) - most (all?) diagnostics specialized in MCMCChains for Chains objects are actually owned by MCMCDiagnosticTools and defined there for chains in a generic AbstractArray format. The implementations in MCMCDiagnosticTools are not only used by MCMCChains but e.g. also by ArviZ.jl.

@codecov
Copy link

codecov bot commented Jul 11, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.56 ⚠️

Comparison is base (82def13) 84.62% compared to head (b38f364) 84.07%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #428      +/-   ##
==========================================
- Coverage   84.62%   84.07%   -0.56%     
==========================================
  Files          20       21       +1     
  Lines        1067     1074       +7     
==========================================
  Hits          903      903              
- Misses        164      171       +7     
Impacted Files Coverage Δ
src/MCMCChains.jl 100.00% <ø> (ø)
src/significance.jl 0.00% <0.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@DominiqueMakowski
Copy link
Author

Should I do that even though it's not technically a "diagnostic" index? It's more a summary index, similar to HPD.

If so, could you point me to an example of a function implemented there that I can follow?

Copy link
Member

@yebai yebai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is ok to keep this here for reasons that @DominiqueMakowski mentioned.

@devmotion
Copy link
Member

I think MCMCDiagnosticTools would be preferable because it can be re-used easily by ArviZ, MCMCChains, and other array-based chain structures. The implementation would be even simpler than in this PR as you don't have to deal with the Chains internals, you would just return these values for a ::AbstractArray{<:Union{Missing,Real}} chain(s) of shape (draws[, chains[, params...]]) (see e.g. https://github.com/TuringLang/MCMCDiagnosticTools.jl/blob/main/src/mcse.jl).

Copy link
Member

@devmotion devmotion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few comments that I think would be good to address, regardless of where the functionality is added.

Moreover, this would need a few tests.

@@ -0,0 +1,19 @@
function p_direction(x::Vector{Float64})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be problematic that this is only defined for Vector - that means e.g. that it won't work for slices of rows or columns.

@@ -0,0 +1,19 @@
function p_direction(x::Vector{Float64})
return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this creates a few unnecessary allocations (the array inside of maximum, x .> 0, and x .< 0). This could be fixed e.g. by

Suggested change
return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])
ntotal = 0
npositive = 0
nnegative = 0
for xi in x
if xi > 0
npositive += 1
elseif xi < 0
nnegative += 1
end
ntotal += 1
end
return max(npositive, nnegative) / ntotal

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in disbelief that this code is more efficient than the above one-liner, but I checked with @btime and indeed... 🤯

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one-liner creates quite a few allocations (zero allocations in the alternative) and it iterates over the array at least twice (one loop in the suggestion).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in disbelief that this code is more efficient than the above one-liner, but I checked with @btime and indeed... 🤯

If I had a nickle for each time I thought I had written the best code and then David has a better solution -- I'd be so rich

Comment on lines +6 to +13
# Store everything.
funs = [p_direction]
func_names = [:p_direction]

# Summarize.
summary_df = summarize(
chains, funs...;
func_names=func_names,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid alloctions:

Suggested change
# Store everything.
funs = [p_direction]
func_names = [:p_direction]
# Summarize.
summary_df = summarize(
chains, funs...;
func_names=func_names,
# Summarize.
summary_df = summarize(
chains, p_direction;
func_names=[:p_direction],

@yebai
Copy link
Member

yebai commented Aug 30, 2023

Closed in favour of arviz-devs/PosteriorStats.jl#10

@yebai yebai closed this Aug 30, 2023
@DominiqueMakowski DominiqueMakowski deleted the p_direction branch August 30, 2023 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants