4.3 Heapsort: Fast Sorting via Data Structures
Where in the Heap?
Problem: Given an array-based heap on n elements and a real number x, efficiently determine whether the kth smallest element in the heap is greater than or equal to x. Your algorithm should be O(k) in the worst-case, independent of the size of the heap. Hint: you do not have to find the kth smallest element; you need only determine its relationship to x.
Solution: There are at least two different ideas that lead to correct but inefficient algorithms for this problem:
- Call extract-minktimes, and test whether all of these are less thanx. This explicitly sorts the firstkelements and so gives us more information than the desired answer, but it takes O(klogn) time to do so.
- The kth smallest element cannot be deeper than the kth level of the heap, since the path from it to the root must go through elements of decreasing value. Thus we can look at all the elements on the first k levels of the heap, and count how many of them are less thanx, stopping when we either find k of them or run out of elements. This is correct, but takes O(min(n,2k-1)) time, since the top k elements have 2k-1 elements.
An O(k) solution can look at only k elements smaller than x, plus at most O(k) elements greater than x. Consider the following recursive procedure, called at the root with i= 1 with count=k:
Contiguous vs. Linked Data Structures
- Contiguously-allocated structuresare composed of single slabs of memory, and include arrays, matrices, heaps, and hash tables.
- Linked data structuresare composed of distinct chunks of memory bound together bypointers, and include lists, trees, and graph adjacency lists.
- Type Deduction
- Computational Complexity
- Essential and Accidental Complexity
1 2 3 4 5 6 7 8 9 10 11
Our two most important tools are (1) the RAM model of computation and (2) the asymptotic analysis of worst-case complexity.
- Permutations - arrangements, or orderings, of items (“arrangement” “tour” “ordering” or “sequence” )
- Subsets - selections from a set of items (“cluster” “collection” “committee” “group” “packaging” or “selection”)
- Trees - hierarchical relationships between items (“hierarchy” “dominance relationship” “ancestor/descendant relationship” or “taxonomy”)
- Graphs - relationships between arbitrary pairs of objects (“network” “circuit” “web” or “relationship”)
- Points - locations in some geometric space (“sites” “positions” or “locations.)
- Polygons - regions in some geometric spaces (“shapes” “regions” or “boundaries”)
- Strings - sequences of characters or patterns. (“text” “characters” “patterns” or “labels”)
全书分为 3 部分：
- 预备知识：总的概括代码涉及的知识点，比如算法和数据结构的重要性和合理性，如何写出正确的代码并证明，如何测试代码，性能评估代码，debug 等。
- 性能：先是介绍一些估算的技巧，比如 72 法则，利特尔法则(Little’s law) 等，之后展开代码算法的性能，如何调试代码使得性能更好或更省空间。