-
Notifications
You must be signed in to change notification settings - Fork 0
PhilLavoie/NIT_RBTrees
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Packages provided for the extension of the NIT standard library. In particular, 4 collections are provided based on the red black tree: TreeSet, TreeMultiset, TreeMap and TreeMultimap. One package provides all the source needed to make those collections work. The other package (tests), is one for automating regression testing. How to run the tests from the main directory: nitc -o OUTFILE -I src tests/red_black_trees_tests.nit Design choices: -Minimal convenience for first release-> But the power to do everything. -Some kind of conformity with the collections in NIT library has been provided, but no direct plugging under a superclass. -Minimal removal facilities through iterators: prevents a lot of confusion and different naming conventions (remove_at can be used to implement remove, remove_once, remove_all, remove_entirely, remove_key, remove_once_key and so on) -Retrievals returns iterators. Why? Because it is better suited for NIT than elements, and it just happens to be my favorite way of doing this. Example of why it is better suited: lowest(): Element, if the collection is empty than the proper way of doing this would be to abort the program, since Element might not be nullable. That is ok because the person didn't probably know that the tree is empty. Same as trying to access an out of bounds element on an array. This is a situation where it isn't so black and white: ceiling( e: Element ): Element. Someone would like to know what is the first element closest to e, but higher if e does not exist. Should the program abort because no elements satisfying the condition exists??? Isn't ceiling a way of asking if there is an element higher than e that exists? One could argue that we could provide a has_ceiling method and start bloating the interface this way for every other method we add (floor, lower, higher). There are other ways to handle the problems, but IMHO, the iterator way is the simplest yet most powerful way of doing this (gives a lot of flexibility) and it just happens to be the choice made for removals as well (convenient, isn't it?). Therefore: remove_at( ceiling( e ) ) removes the ceiling, if it exists, does nothing if it doesn't. Alternative NOT chosen: if has_ceiling( e ) then remove( ceiling( e ) ) end This solution isn't perfect. In the example above the client has less code to write. But if he wanted to remove e he would have more code to write: remove_at( find( e ) ) Instead of remove( e ) Convenient facilities should be added in the future. Another important downfall is that later in this text I propose another iterator for linear unordered access (for manipulating values in maps for example). However, doing this might cause a problem: why should the collection decide what kind of iterator to return? This opens a whole new discussion that is very interesting but out of the scope of this project. A solution would be to return a positional object from which any iterator can be constructed but then again someone could just argue: why would someone want an unordered iterator starting elsewhere than from the root? Open question. -The choice has been made to not differenciate methods names depending on if they work on keys or element. For example, has( e ) for a set/multiset have not been renamed to has_key( e ) for a map/multimap. The reason behind this is that it is obvious that the Map only knows how to compare keys, and not values. Therefore, the name is has( key ). However, this would have to be changed if someone would like to insert has_value( value ) to the tree. But beware, should a tree be aware of how to compare values? Also, we would have to change every other searching method: key_floor, key_ceiling, etc... Is there really a gain in clarity? I think not, but it is open for discussion, as usual. -The ground rule is: insert elements, use keys to acces things and return iterators on access. Remove using iterators. Iterators return elements. Elements and keys for set/multiset are the same. Elements are map entries for map and keys are keys (MapEntry[ K, V ] and K respectively). Sufficient satellite code has been provided to cover for the missing facilities. Example: no trees have a subset extraction method. Imagine that, for some reason, you would like to store all elements between min and max in an array. You can do this, provided you implement the array inserter (by redefinition for example) var a = new Array var algos = new Algos algos.copy( new BoundedIterator.inclusive( tree.ceiling( min ), tree.floor( max ) ), new BackInserter( a ) ) This allows the interface of the trees to be lighten. Note that I really don't know why someone would like to extract a subset, since iteration on subset is provided. What has been left aside: -Support for operations on values for maps. Is it: -Required -Advisable: I have yet to be convinced of why someone would try to operate on values in a map (remove, retrieve, mainly access operations, ...). -Some convenient methods have been added and some weren't. The choices are mostly arbitrary and further conveniences can easily be provided. See next point. -All facilities imaginable that uses the subset proposed can still be implemented, but most have been left aside. Following are some examples and some discussion on what should be considered if they are to be implemented. -For instance, pop_highest does not exist, but can be done like this: tree.remove_at( tree.highest() ) or tree.remove_at( tree.reverse_iterator ) -Another example: higher and lower methods. (Returns the first element lower or higher than the provided element). Can be done with floor and ceiling. Example: var f = tree.floor( 50 ) if f.item == 50 then f.previous end #Now f is on the lower element of 50, might be invalid. In fact, if it is to be implemented, it'll be exactly like that. NOTE that the interface has been designed in such a way that it works for BOTH multi and single version of trees (floor returns an iterator on the first element in the ascending order that corresponds to the one provided or the previous one if it does not exist). -Provide an insert_all, for convenience (insert_all( c:Collection...) ) -Provide a remove_all -> This is more ambiguous. What does remove_all means on a multiset for example? Remove all occurrences or remove one occurrence of every provided element. For a set remove_all surely means the latter. On a multiset for example, does it mean remove_all_once or remove_every_possible_element_that_matches??? See next point. -Implementing remove_once and remove_entirely for multi versions. Remove entirely can now be done like this: var f = tree.floor( 50 ) if f.item == 50 then while f.item == 50 do tree.remove_at( f ) end end Or like this: while tree.has( 50 ) do tree.remove_at( find( 50 ) ) end Remove once can be done like this: tree.remove_at( tree.find( 50 ) ) -operator [] on map for convenience. -Provide a way to extract all keys in O(n). I'm still not sure what would be the point of that, but it's possible. -Extract all values in O(n), same as above. -To answer both previous points, maybe it's just better to implement a breadth first iterator or something like that that does not provide any order, but guarantees a O(n) iteration. What's known tests haven't been implemented: -Test multimaps with duplicate keys having different values (though duplicate keys with same values have been tested)
About
An implementation of the red black tree in NIT. 4 interfaces layered on top are also provided, namely: a set, multiset, map and a multimap.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published