Given a set of n numbers, we wish to find the i largest in sorted order using a comparison- based algorithm. Find the algorithm that implements each of the following methods with the best asymptotic worst-case running time, and analyze the running times of the algorithms in terms of n and i.
a. Sort the numbers, and list the i largest. b. Build a max-priority queue from the numbers, and call EXTRACT-MAX i times. c. Use an order-statistic algorithm to find the ith largest number, partition around that number, and sort the i largest numbers.
a和b的代码在这里.code. 如果先排序的话,就需要O(nlgn)+O(i)的时间. 用堆的话,需要O(n)+O(ilogn)的时间.
对于c,code 利用顺序统计量的话,找到这个值需要O(n)的时间,然后需要对i个数排序需要O(ilogi)时间.
For n distinct elements with positive weights such that ,the weighted (lower) median is the element satisfying
and
a. Argue that the median of x1, x2, ..., xn is the weighted median of the xi with weights wi = 1/n for i = 1,2, ..., n.
b. Show how to compute the weighted median of n elements in O(n lg n) worst-case time using sorting.
c. Show how to compute the weighted median in Θ(n) worst-case time using a linear- time median algorithm such as SELECT from Section 9.3.
The post-office location problem is defined as follows. We are given n points p1, p2, ..., pn with associated weights w1, w2, ..., wn. We wish to find a point p (not necessarily one of the input points) that minimizes the sum
where d(a, b) is the distance between points a and b.
d. Argue that the weighted median is a best solution for the 1-dimensional post-office location problem, in which points are simply real numbers and the distance between points a and b is d(a, b) = |a - b|.
e. Find the best solution for the 2-dimensional post-office location problem, in which the points are (x, y) coordinate pairs and the distance between points a = (x1, y1) and b = (x2, y2) is the Manhattan distance given by d(a, b) = |x1 - x2| + |y1 - y2|.
a. 好显然TAT.
b. 先排序,然后开始遍历,直到加上某一个数字的权重 ≥ 1/2.
c. code
d. 假设p是带权中位数,只要偏离p,距离就会越来越大
e. 分别找x和y方向的带权中位数,因为用的是曼哈顿距离,x和y是独立的,所以可以分开对待.
The worst-case number T(n) of comparisons used by SELECT to select the ith order statistic from n numbers was shown to satisfy T(n) = Θ(n), but the constant hidden by the Θ-notation is rather large. When i is small relative to n, we can implement a different procedure that uses SELECT as a subroutine but makes fewer comparisons in the worst case.
a. Describe an algorithm that uses Ui(n) comparisons to find the ith smallest of n elements, where (Hint: Begin with disjoint pairwise comparisons, and recurse on the set containing the smaller element from each pair.)
b. Show that, if i < n/2, then Ui(n) = n + O(T (2i) lg(n/i)).
c. Show that if i is a constant less than n/2, then Ui(n) = n + O(lg n).
d. Show that if i = n/k for k≥2,then Ui(n) = n+O(T(2n/k)lgk).
a. 1. 如果i ≥ n/2,直接调用SELECT(n, i) 2. 如果i < n/2,那么就两个两个比较,生成了小的一组,同时也记录大的一组(比如可以用map映射起来) 3. 对小的一组递归调用这个函数 4. 当调用回来时,能找到这小的一组的第i个数,这个数左边都是比它小的数字,再从map中找到这i个数字对应的数字,这样总共有2i个数字,再对这2i个数字调用SELECT. 很tricky的一点是为什么最后要调用T(2i),看下面这个数字[1,2,3,4],order-2是2,可是分组完后较小元素是[1,3],所以还要对另外i个数字进行比较.
Follow @louis1992 on github to help finish this task