Thoughts concerning optimization:
- I should not optimize until I have access to real-world input samples. Any
optimizations are bound to result in increased code complexity, which is not
worth it unless I can demonstrate a non-trivial reduction in running time in a
real-world setting.
- I should profile the program and see if there are any quick wins.
- Most obviously, I could implement the optimization of using a B-tree in the
top level, and see what effect this has on performance. It would definitely
improve the query time, and hopefully it wouldn't affect the update time too
much.
- If I do implement the B-tree optimization, I should also store a binary search
tree representation of the same forest in the top level, for the user-defined
augmentation. That way, updates will require O(log N) calls to
Augmentation.combine, rather than O(log^2 N / log log N) calls. We have no
control over how long Augmentation.combine takes, so we should minimize calls
to that method. It will take a bit of care to ensure that fetching the
augmentation for a connected component takes O(log N / log log N) time,
however.
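As a rough sketch of why a plain binary search tree keeps the combine count at
O(log N): if each node stores the combined augmentation of its subtree, an
update only has to recombine along the path from the changed node to the root.
The Augmentation interface below is an assumption about its shape (a
two-argument combine), not necessarily the library's exact signature.

    // Hypothetical sketch; field and method names are illustrative only.
    public interface Augmentation {
        /** Combines the augmentation values of two subtrees. */
        Object combine(Object value1, Object value2);
    }

    public class AugmentedBstNode {
        public AugmentedBstNode left;
        public AugmentedBstNode right;
        public Object value;         // this node's own augmentation value
        public Object subtreeValue;  // combined value for the entire subtree

        /** Recomputes subtreeValue from this node's value and its children's
         *  subtreeValues. After a change, calling this on each node from the
         *  changed node up to the root makes O(log N) calls to combine. */
        public void recombine(Augmentation augmentation) {
            Object combined = value;
            if (left != null) {
                combined = augmentation.combine(left.subtreeValue, combined);
            }
            if (right != null) {
                combined = augmentation.combine(combined, right.subtreeValue);
            }
            subtreeValue = combined;
        }
    }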
- Alternatively, in each B-tree node, we could store a binary search tree that
combines the augmentation information for the items in that node. This might
even benefit ConnGraphs that do not have user-supplied augmentation.
- Actually, there is a way of speeding up queries by an O(log log N) factor that
should be cleaner and easier to implement than a B-tree. We could change each
Euler tour node in the top level to store its kth ancestor (e.g. if k = 3,
then each node stores a pointer to its great-grandparent). Each time we change
a node's parent, we have to update the pointers for the node and for all of
the descendants that are fewer than k levels below it - up to 2^k - 1 nodes in
total - so each parent change costs O(2^k) time. Since there are O(log N)
parent pointer changes per update, and updates already take O(log^2 N)
amortized time, we can afford any k with 2^k = O(log N), i.e. a k value of up
to lg lg N + O(1) (but interestingly, not an arbitrary k = O(log log N);
k = 2 lg lg N, for example, would make 2^k = log^2 N). However, I think this would be
significantly slower than a B-tree in practice.
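A minimal sketch of the kth-ancestor bookkeeping described above, using
assumed field names (parent, left, right, kthAncestor) rather than the
library's actual fields:

    // Hypothetical sketch of a top-level Euler tour node with a kth-ancestor
    // pointer; names are illustrative only.
    public class TopLevelNode {
        public TopLevelNode parent;
        public TopLevelNode left;
        public TopLevelNode right;
        public TopLevelNode kthAncestor;  // the ancestor k levels up, if any

        /** Refreshes kthAncestor for this node and for its descendants fewer
         *  than k levels below it - at most 2^k - 1 nodes. Call this after
         *  changing this node's parent pointer. */
        public void fixKthAncestors(int k) {
            // ancestors[i] is the ancestor i levels above this node, or null
            TopLevelNode[] ancestors = new TopLevelNode[k + 1];
            ancestors[0] = this;
            for (int i = 1; i <= k; i++) {
                ancestors[i] = ancestors[i - 1] != null ? ancestors[i - 1].parent : null;
            }
            fixKthAncestors(k, 0, ancestors);
        }

        private void fixKthAncestors(int k, int depthBelow, TopLevelNode[] ancestors) {
            if (depthBelow >= k) {
                return;
            }
            // A node depthBelow levels below the changed node has its kth
            // ancestor k - depthBelow levels above the changed node.
            kthAncestor = ancestors[k - depthBelow];
            if (left != null) {
                left.fixKthAncestors(k, depthBelow + 1, ancestors);
            }
            if (right != null) {
                right.fixKthAncestors(k, depthBelow + 1, ancestors);
            }
        }
    }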
- If I don't implement the B-tree optimization, there are a couple of small
optimizations I could try. First, I could move the field
EulerTourNode.augmentationFunc to EulerTourVertex, in order to save a little
bit of space. Second, I could create EulerTourNode and EulerTourVertex
subclasses specifically for the top level, and offload the fields related to
user augmentation to those subclasses. These changes seem inelegant, but it
might be worth checking the effect they have on performance.
- I looked at the heuristics recommended in
http://people.csail.mit.edu/karger/Papers/impconn.pdf (Iyer et al. (2001): An
Experimental Study of Poly-Logarithmic Fully-Dynamic Connectivity Algorithms).
The first heuristic is: before pushing down all of the same-level forest
edges, which is an expensive operation, sample O(log N) same-level non-forest
edges to see if we can get lucky and find a replacement edge without pushing
anything. The second heuristic is to refrain from pushing edges in
sufficiently small components.
The first heuristic seems reasonable. However, I don't get the second
heuristic. The concept seems to be basically the same as the first - don't
push down any edges if we can cheaply find a replacement edge. However, the
execution is cruder. Rather than limiting the number of edges to search, which
is closely related to the cost of the search, the second heuristic is based on
the number of vertices, which is not as closely related.
- I have a few ideas for implementing this first heuristic; each could be
attempted and its effect on performance measured. The first is that to sample
the graph edges in a semi-random order, I could augment each Euler tour node
with the sum, over all canonical vertex visits in its subtree, of the number
of adjacent same-level graph edges. Then, to obtain a sample, we select a
vertex with probability proportional to this adjacency count. This is fairly
straightforward: at each node, we decide whether to go left, go right, or use
the current vertex, in proportion to the left subtree's count, the current
vertex's count, and the right subtree's count. (A sketch of this descent
appears below.)
This is not exactly uniform, because after we select a vertex, we choose an
arbitrary adjacent edge. However, it seems close enough, and it's not easy to
do better.
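A minimal sketch of that descent, using assumed field names (left, right,
vertexEdgeCount for the number of same-level graph edges adjacent to the
node's vertex - zero for non-canonical visits - and subtreeEdgeCount for the
sum over the node's subtree):

    import java.util.Random;

    // Hypothetical sketch; names are illustrative only.
    public class SampleTreeNode {
        public SampleTreeNode left;
        public SampleTreeNode right;
        public int vertexEdgeCount;   // adjacent same-level graph edges (0 if not a canonical visit)
        public int subtreeEdgeCount;  // sum of vertexEdgeCount over this subtree

        /** Returns a node whose vertex is chosen with probability proportional
         *  to vertexEdgeCount. Assumes subtreeEdgeCount > 0 at the root. */
        public SampleTreeNode sampleVertex(Random random) {
            SampleTreeNode node = this;
            while (true) {
                int leftCount = node.left != null ? node.left.subtreeEdgeCount : 0;
                int rightCount = node.right != null ? node.right.subtreeEdgeCount : 0;
                int r = random.nextInt(leftCount + node.vertexEdgeCount + rightCount);
                if (r < leftCount) {
                    node = node.left;
                } else if (r < leftCount + node.vertexEdgeCount) {
                    return node;  // use the current vertex
                } else {
                    node = node.right;
                }
            }
        }
    }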
- After sampling an edge, we should remove it from the adjacency list, to avoid
repeatedly sampling the same edge. We should then store it in an array of
edges we sampled, so that later, we can either re-add the edges to the
adjacency lists or push all of them down as necessary.
- If the sampling budget (i.e. the number of edges we intend to sample) is at
least the number of same-level graph edges, then we should forgo pushing down
any edges, regardless of whether we find a replacement edge.
- We don't have to sample edges from the smaller of the two post-cut trees; we
can instead sample from the larger one if it has fewer same-level graph edges.
This increases our chances of finding a replacement edge if there is one. As
long as we don't push down any forest edges in the larger tree, we're safe.
- With an extra augmentation, we can determine whether there is probably a
replacement edge. This helps us because if there is probably no replacement
edge, then we can save some time by skipping edge sampling entirely. (If the
sampling budget is at least the number of same-level graph edges, then we
should also refrain from pushing down any edges, as in a previously mentioned
optimization.)
Assign each ConnEdge a random integer ID. Store the XOR of the IDs of all
adjacent same-level graph edges in each of the vertices. Augment the Euler
tour trees with the XOR of those values for all canonical visits to vertices.
The XOR stored in a post-cut tree's root node is equal to the XOR of all of
the replacement edges' IDs, because each non-replacement edge is in two
adjacency lists and cancels itself out, while each replacement edge is in one
adjacency list. Thus, the XOR is 0 if there is no replacement edge, and it is
non-zero with probability 1 - 1 / 2^32 if there is at least one replacement
edge.
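A minimal sketch of the XOR bookkeeping, using hypothetical names
(adjacentEdgeXor on the vertex, a subtree XOR in the Euler tour augmentation);
these are not the library's actual fields:

    import java.util.Random;

    // Hypothetical sketch; names are illustrative only.
    public final class XorSketch {
        private static final Random RANDOM = new Random();

        public static class Vertex {
            public int adjacentEdgeXor;  // XOR of the IDs of adjacent same-level graph edges
        }

        public static class Edge {
            public final int id = RANDOM.nextInt();  // random ID assigned at creation
            public Vertex vertex1;
            public Vertex vertex2;
        }

        /** Call when "edge" is added to or removed from the same-level
         *  adjacency lists; since XOR is its own inverse, the same code
         *  handles both cases. */
        public static void toggleEdge(Edge edge) {
            edge.vertex1.adjacentEdgeXor ^= edge.id;
            edge.vertex2.adjacentEdgeXor ^= edge.id;
        }

        /** The Euler tour augmentation: combines a node's subtree XOR from its
         *  children's subtree XORs and its own vertex value (0 if the node is
         *  not a canonical visit). */
        public static int combineXor(int leftSubtreeXor, int rightSubtreeXor, int vertexXor) {
            return leftSubtreeXor ^ rightSubtreeXor ^ vertexXor;
        }

        /** A non-zero root XOR means there is at least one replacement edge;
         *  a zero root XOR means there is almost certainly none (the
         *  replacement edges' IDs could cancel with probability about
         *  1 / 2^32). */
        public static boolean probablyHasReplacementEdge(int rootSubtreeXor) {
            return rootSubtreeXor != 0;
        }
    }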
- If one of the post-cut trees has a same-level forest edge and the other does
not, and the difference in the number of same-level graph edges is not that
large, we should favor the one that does not, because it's expensive to push
forest edges. Also, there's no need to sample if we pick a tree that has no
same-level forest edges.
- I wonder if there's a good way to estimate the cost of pushing down a given
set of forest edges. For example, is there a strong correlation between the
number of forest edges and the cost of pushing them? We could use a larger
sampling budget the greater this cost is.
- During sampling, it might help to push down edges whose endpoints are
connected in the next-lower level, and to not count them against the sampling
budget. By paying for some of the edges, we're able to sample more edges, so
we're less likely to have to push down the forest edges.
The downside is that checking whether the endpoints are in the same connected
component takes O(log N) time. To mitigate this, we should refrain from
attempting to push down edges until necessary. That is, we should spend the
entire sampling budget first, then search the edges we sampled for an edge
that we can push down. After finding one such edge, we should sample another
edge, then search for another edge to push down, etc.
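A minimal sketch of that order of operations. The SamplingOps hooks are
hypothetical placeholders for the real data structure operations, not the
library's API:

    import java.util.List;

    // Hypothetical sketch; names are illustrative only.
    public final class SamplingSketch {
        public interface SamplingOps<E> {
            E sampleEdge();                           // removes and returns a sampled edge, or null if none remain
            boolean isReplacementEdge(E edge);        // does the edge reconnect the two post-cut trees?
            boolean isConnectedAtLowerLevel(E edge);  // O(log N) connectivity check at the next-lower level
            void pushDown(E edge);                    // move the edge to the next-lower level
        }

        /** Returns a replacement edge if sampling finds one, or null otherwise.
         *  "sampled" collects edges that must later be re-added to the
         *  adjacency lists or pushed down. */
        public static <E> E sampleForReplacement(SamplingOps<E> ops, int budget, List<E> sampled) {
            int remainingBudget = budget;
            int nextToCheck = 0;  // first sampled edge not yet checked for pushing down
            while (remainingBudget > 0) {
                E edge = ops.sampleEdge();
                if (edge == null) {
                    return null;
                }
                sampled.add(edge);
                if (ops.isReplacementEdge(edge)) {
                    return edge;
                }
                remainingBudget--;
                if (remainingBudget == 0) {
                    // Only now do we pay for O(log N) connectivity checks: look
                    // for one sampled edge we can push down, which refunds one
                    // unit of budget and lets us sample one more edge.
                    while (nextToCheck < sampled.size()) {
                        E candidate = sampled.get(nextToCheck);
                        if (ops.isConnectedAtLowerLevel(candidate)) {
                            ops.pushDown(candidate);
                            sampled.remove(nextToCheck);  // remove by index
                            remainingBudget++;
                            break;
                        }
                        nextToCheck++;
                    }
                }
            }
            return null;
        }
    }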
- When we are done sampling edges, it might still be beneficial to iterate over
the rest of the same-level graph edges in a (semi-)random order. This does
increase the amount of time that iteration takes. However, using a random
order could be helpful if the sequence of updates has some sort of pattern
affecting the location of replacement edges. For example, if there is a
tendency to put replacement edges near the end of the Euler tour, or to
cluster replacement edges so that they are close to each other in the Euler
tour, then random iteration should tend to locate a replacement edge faster
than in-order iteration. (If we know from a previously mentioned optimization
that there is probably no replacement edge, then we shouldn't bother to
iterate over the edges in random order.)