Mann Whitney U test

Esteban Zapata Rojas edited this page Jan 18, 2018 · 1 revision

Mann-Whitney U

This is an implementation of the Wilcoxon rank-sum test, also known as the Mann-Whitney U test. It is a non-parametric test used to determine whether it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample.

Although this test is implemented as WilcoxonRankSumTest, it can also be instantiated through the alias MannWhitneyU.new.

Instance methods

rank

This method receives the elements that need to be ranked. It returns a hash with one entry per distinct element, where the key is the element and the value is another hash containing how many times the element occurs (counter) and its ranking (rank). It ranks in ascending order.

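When an element is tied, rank holds the sum of the positions occupied by its occurrences, so the ranking of each individual occurrence is rank divided by counter. In the following example, the element 1 appears twice, at positions 1 and 2, which gives { counter: 2, rank: 3 }, so each 1 has a ranking of 3/2.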

pry(main)> test = StatisticalTest::WilcoxonRankSumTest.new
=> #<Statistics::StatisticalTest::WilcoxonRankSumTest:0x00000001556698>
pry(main)> test.rank([1,1,2,3,4,5,6,7,7,8])
=> {1=>{:counter=>2, :rank=>3},
 2=>{:counter=>1, :rank=>3},
 3=>{:counter=>1, :rank=>4},
 4=>{:counter=>1, :rank=>5},
 5=>{:counter=>1, :rank=>6},
 6=>{:counter=>1, :rank=>7},
 7=>{:counter=>2, :rank=>17},
 8=>{:counter=>1, :rank=>10}}
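The ranking scheme shown above can be sketched in plain Ruby. This is only an illustration that reproduces the output of the example, not the gem's source; the name rank_sketch is hypothetical.

```ruby
# Hypothetical helper, not part of the gem: rank elements in ascending order,
# accumulating the 1-based positions occupied by tied elements.
def rank_sketch(elements)
  elements.sort.each_with_index.with_object({}) do |(element, index), ranked|
    entry = (ranked[element] ||= { counter: 0, rank: 0 })
    entry[:counter] += 1      # how many times the element occurs
    entry[:rank] += index + 1 # sum of the positions it occupies
  end
end

ranks = rank_sketch([1, 1, 2, 3, 4, 5, 6, 7, 7, 8])
ranks[1] # counter 2, rank 3  (positions 1 + 2)
ranks[7] # counter 2, rank 17 (positions 8 + 9)

# Average rank of each individual 7
average_rank = ranks[7][:rank] / ranks[7][:counter].to_f # => 8.5
```

Storing the summed positions together with the counter makes it cheap to recover the average rank of a tied element later, by dividing one by the other.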

perform test

This method performs the Mann-Whitney U test. It expects the alpha value, the tail and the two groups to be compared. The tail param can be :one_tail or :two_tail to specify whether the method should perform a one-tailed or a two-tailed test.

Keep in mind that this method performs the test using the Z statistic, derived from the U statistic, against a normal distribution. The normal distribution used has a mean of 0 and a standard deviation of 1.

It returns a hash with the following keys:

  • probability: the probability of the z statistic, computed with the standard normal CDF. In the examples on this page it corresponds to the CDF evaluated at the absolute value of z.
  • u: The U statistic.
  • z: The Z statistic, which is just a transformation of the U statistic.
  • p_value: the p value, calculated as 1 - probability for a one-tailed test and as twice that amount for a two-tailed test.
  • alpha: the specified alpha value.
  • null: Either true or false. If true, it means that the null hypothesis should not be rejected.
  • alternative: Either true or false. If true, it means that the null hypothesis can be rejected.
  • confidence_level: Defined as 1 - alpha.

Keep in mind that the null and alternative keys cannot be true at the same time.
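How these keys relate to each other can be sketched in plain Ruby. This is a rough illustration under stated assumptions, not the gem's code: decide and standard_normal_cdf are hypothetical names, the CDF is evaluated at the absolute value of z, and a p value exactly equal to alpha is assumed to count as a rejection.

```ruby
# Hypothetical helper: standard normal CDF built on Ruby's Math.erf.
def standard_normal_cdf(x)
  0.5 * (1.0 + Math.erf(x / Math.sqrt(2.0)))
end

# Hypothetical helper: derive the result keys from a z statistic.
def decide(z, alpha, tail)
  probability = standard_normal_cdf(z.abs) # assumption: evaluated at |z|
  p_value = 1.0 - probability
  p_value *= 2 if tail == :two_tail        # two-tailed tests double the p value
  reject = p_value <= alpha                # assumption: the boundary counts as rejection

  { probability: probability, p_value: p_value, alpha: alpha,
    null: !reject, alternative: reject, confidence_level: 1 - alpha }
end

one_tail = decide(-1.3416407864998738, 0.05, :one_tail)
two_tail = decide(-1.3416407864998738, 0.01, :two_tail)
```

Because null is defined as the negation of alternative in this sketch, the two keys can never be true at the same time.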

pry(main)> group_one
=> [0.12819915256260872, 0.24345459073897613, 0.27517650565714014, 0.8522185144081152, 0.05471111219486524]
pry(main)> group_two
=> [0.3272414061985621, 0.2989306116723194, 0.642664937717922]
# Alpha of 0.05 and one-tailed test
pry(main)> test.perform(alpha = 0.05, :one_tail, group_one, group_two)
=> {:probability=>0.9101437525605001, :u=>3.0, :z=>-1.3416407864998738, :p_value=>0.08985624743949994, :alpha=>0.05, :null=>true, :alternative=>false, :confidence_level=>0.95}
# Alpha of 0.01 and one-tailed test
pry(main)> test.perform(alpha = 0.01, :one_tail, group_one, group_two)
=> {:probability=>0.9101437525605001, :u=>3.0, :z=>-1.3416407864998738, :p_value=>0.08985624743949994, :alpha=>0.01, :null=>true, :alternative=>false, :confidence_level=>0.99}
# Alpha of 0.01 and two-tailed test
pry(main)> test.perform(alpha = 0.01, :two_tail, group_one, group_two)
=> {:probability=>0.9101437525605001, :u=>3.0, :z=>-1.3416407864998738, :p_value=>0.17971249487899987, :alpha=>0.01, :null=>true, :alternative=>false, :confidence_level=>0.99}
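The u and z values in the session above can be checked by hand with the textbook normal approximation. The following is a minimal sketch, assuming no ties between the groups (so no tie correction is applied to the standard deviation); u_and_z is a hypothetical helper and may not match how the gem computes these values internally.

```ruby
group_one = [0.12819915256260872, 0.24345459073897613, 0.27517650565714014,
             0.8522185144081152, 0.05471111219486524]
group_two = [0.3272414061985621, 0.2989306116723194, 0.642664937717922]

# Hypothetical helper: U statistic and its z transformation, assuming no ties.
def u_and_z(group_one, group_two)
  n1 = group_one.size
  n2 = group_two.size

  # 1-based position of every value in the pooled, sorted sample
  position = (group_one + group_two).sort.each_with_index.map { |v, i| [v, i + 1] }.to_h

  # Rank sums per group
  r1 = group_one.sum { |v| position[v] }
  r2 = group_two.sum { |v| position[v] }

  # U for each group; the smaller one is the test statistic
  u1 = r1 - n1 * (n1 + 1) / 2.0
  u2 = r2 - n2 * (n2 + 1) / 2.0
  u = [u1, u2].min

  # Mean and standard deviation of U under the null hypothesis
  mean_u = n1 * n2 / 2.0
  std_u = Math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

  { u: u, z: (u - mean_u) / std_u }
end

result = u_and_z(group_one, group_two)
result[:u] # => 3.0
```

Here both rank sums are 18, so U is min(18 - 15, 18 - 6) = 3, the mean of U is 7.5 and its standard deviation is the square root of 11.25, which gives a z of roughly -1.3416.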