[Feature] Probability of Direction (pd) #428

DominiqueMakowski · 2023-07-11T20:46:50Z

Indices of effect significance/existence computed from posterior distributions are very common in some fields. This feature adds the probability of direction, p_direction(), a Bayesian equivalent of the p-value.

I followed the implementation of mean() by using the summarize() workflow.

devmotion · 2023-07-11T21:06:14Z

Thank you for the PR! I'd suggest, however, opening a PR to MCMCDiagnosticTools (first) - most (all?) diagnostics specialized in MCMCChains for Chains objects are actually owned by MCMCDiagnosticTools and defined there for chains in a generic AbstractArray format. The implementations in MCMCDiagnosticTools are not only used by MCMCChains but e.g. also by ArviZ.jl.

codecov · 2023-07-11T21:28:29Z

Codecov Report

Patch coverage has no change and project coverage change: -0.56 ⚠️

Comparison is base (82def13) 84.62% compared to head (b38f364) 84.07%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #428      +/-   ##
==========================================
- Coverage   84.62%   84.07%   -0.56%     
==========================================
  Files          20       21       +1     
  Lines        1067     1074       +7     
==========================================
  Hits          903      903              
- Misses        164      171       +7

Impacted Files	Coverage Δ
src/MCMCChains.jl	`100.00% <ø> (ø)`
src/significance.jl	`0.00% <0.00%> (ø)`

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

DominiqueMakowski · 2023-07-12T08:28:42Z

Should I do that even though it's not technically a "diagnostic" index? It's more a summary index, similar to HPD.

If so, could you point me to an example of a function implemented there that I can follow?

yebai

I think it is ok to keep this here for reasons that @DominiqueMakowski mentioned.

devmotion · 2023-07-12T19:52:28Z

I think MCMCDiagnosticTools would be preferable because it can be re-used easily by ArviZ, MCMCChains, and other array-based chain structures. The implementation would be even simpler than in this PR as you don't have to deal with the Chains internals, you would just return these values for a ::AbstractArray{<:Union{Missing,Real}} chain(s) of shape (draws[, chains[, params...]]) (see e.g. https://github.com/TuringLang/MCMCDiagnosticTools.jl/blob/main/src/mcse.jl).

devmotion

I added a few comments that I think would be good to address, regardless of where the functionality is added.

Moreover, this would need a few tests.

devmotion · 2023-07-12T19:54:57Z

src/significance.jl

@@ -0,0 +1,19 @@
+function p_direction(x::Vector{Float64})


I think it might be problematic that this is only defined for Vector - that means e.g. that it won't work for slices of rows or columns.

devmotion · 2023-07-12T20:16:24Z

src/significance.jl

@@ -0,0 +1,19 @@
+function p_direction(x::Vector{Float64})
+    return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])


Hmm, this creates a few unnecessary allocations (the array inside of maximum, x .> 0, and x .< 0). This could be fixed e.g. by

Suggested change

return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])

ntotal = 0

npositive = 0

nnegative = 0

for xi in x

if xi > 0

npositive += 1

elseif xi < 0

nnegative += 1

end

ntotal += 1

end

return max(npositive, nnegative) / ntotal

I am in disbelief that this code is more efficient than the above one-liner, but I checked with @btime and indeed... 🤯

The one-liner creates quite a few allocations (zero allocations in the alternative) and it iterates over the array at least twice (one loop in the suggestion).

I am in disbelief that this code is more efficient than the above one-liner, but I checked with @btime and indeed... 🤯

If I had a nickle for each time I thought I had written the best code and then David has a better solution -- I'd be so rich

devmotion · 2023-07-12T20:17:23Z

src/significance.jl

+    # Store everything.
+    funs = [p_direction]
+    func_names = [:p_direction]
+
+    # Summarize.
+    summary_df = summarize(
+        chains, funs...;
+        func_names=func_names,


To avoid alloctions:

Suggested change

# Store everything.

funs = [p_direction]

func_names = [:p_direction]

# Summarize.

summary_df = summarize(

chains, funs...;

func_names=func_names,

# Summarize.

summary_df = summarize(

chains, p_direction;

func_names=[:p_direction],

yebai · 2023-08-30T09:16:10Z

Closed in favour of arviz-devs/PosteriorStats.jl#10

DominiqueMakowski added 2 commits July 11, 2023 21:41

Add p_direction() to significance.jl

ec02b6f

export function

b38f364

yebai approved these changes Jul 12, 2023

View reviewed changes

devmotion reviewed Jul 12, 2023

View reviewed changes

This was referenced Jul 29, 2023

Add function: p_direction() TuringLang/MCMCDiagnosticTools.jl#90

Closed

P-Direction in zero-inflated models easystats/bayestestR#616

Closed

yebai closed this Aug 30, 2023

DominiqueMakowski deleted the p_direction branch August 30, 2023 09:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] Probability of Direction (pd) #428

[Feature] Probability of Direction (pd) #428

DominiqueMakowski commented Jul 11, 2023

devmotion commented Jul 11, 2023

codecov bot commented Jul 11, 2023

DominiqueMakowski commented Jul 12, 2023

yebai left a comment

devmotion commented Jul 12, 2023

devmotion left a comment

devmotion Jul 12, 2023

devmotion Jul 12, 2023

DominiqueMakowski Jul 12, 2023

devmotion Jul 12, 2023

cpfiffer Jul 12, 2023

devmotion Jul 12, 2023

yebai commented Aug 30, 2023

		@@ -0,0 +1,19 @@
		function p_direction(x::Vector{Float64})
		return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])

-    return maximum([sum(x .> 0) ./ length(x), sum(x .< 0) ./ length(x)])
+    ntotal = 0
+    npositive = 0
+    nnegative = 0
+    for xi in x
+        if xi > 0
+            npositive += 1
+        elseif xi < 0
+            nnegative += 1
+        end
+        ntotal += 1
+    end
+    return max(npositive, nnegative) / ntotal

[Feature] Probability of Direction (pd) #428

[Feature] Probability of Direction (pd) #428

Conversation

DominiqueMakowski commented Jul 11, 2023

devmotion commented Jul 11, 2023

codecov bot commented Jul 11, 2023

Codecov Report

DominiqueMakowski commented Jul 12, 2023

yebai left a comment

Choose a reason for hiding this comment

devmotion commented Jul 12, 2023

devmotion left a comment

Choose a reason for hiding this comment

devmotion Jul 12, 2023

Choose a reason for hiding this comment

devmotion Jul 12, 2023

Choose a reason for hiding this comment

DominiqueMakowski Jul 12, 2023

Choose a reason for hiding this comment

devmotion Jul 12, 2023

Choose a reason for hiding this comment

cpfiffer Jul 12, 2023

Choose a reason for hiding this comment

devmotion Jul 12, 2023

Choose a reason for hiding this comment

yebai commented Aug 30, 2023