-
-
Notifications
You must be signed in to change notification settings - Fork 519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
binary-search-tree: Binary Search Tree Exercise #513
Comments
@remcopeereboom When thinking about what operations to include on a BST I look back to CS Data Structures course and they usually always go through the main 4 (Traverse, Search, Insert, Delete). Quick question how would implementing delete change the behavior of search? I've always seen these methods implemented separately of each other. Regarding the different ways of traversal I don't think there is a "wrong" way, but through nits you can encourage more efficiency. I think the goal for this particular problem (and any algorithm type problem) is not only to practice the language but also learn the algorithm/data structure and its benefits since they are building blocks for a deeper understanding of programming in general. |
The delete operations can cause the tree to be unbalanced towards one or the other direction. It's possible to end up with a tree that looks like:
This transforms the search from logarithmic to linear worst case and average case performance. The problem stems from the fact that while it is trivial to delete nodes that have either zero children or one child, deleting nodes that have two children mean that you now have to select a branch to be the top node. Depending on your bias towards either the predecessor or the successor child node, this can eventually lead to either a left-biased or a right-biased tree respectively. This will lead to a tree that has linear performance for the major operations. You can mitigate the problem by alternating left and right biasses. Experimentally (after many random insertions or deletions) this leads to O(sqrt n) average case performance, because deletions and insertions take place in random branches (apparently the alternating bias isn't quite enough to keep the tree perfectly balanced, but still good enough to prevent the tree from being linear). I'm pretty sure that nobody has proven this algebraically yet however, so that is still an open problem in computer science! |
@remcopeereboom |
Absolutely. Worst case performance of a BST is O(n), not O(log n). But generally we don't have worst case performance because we don't insert elements in monotonically increasing or decreasing order. Instead we insert elements at random. The average case or expected performance is O(log n) for a BST with random insertions. But only O(sqrt n) for a tree with random insertions and deletions.
A self balancing tree like a Red-Black Tree does solve this problem. In fact Red-Black trees are used specifically for this purpose! |
Good points @remcopeereboom I had to do a little research and refresh my brain a bit. I still think there is value in implementing the delete operation as it requires some thought, and is considered one of the basic operations of a BST (a least from an educational standpoint). Through nits one can encourage a more robust BST, or this exercise could just be a "basic" BST and there could be a different exercise that builds on the basic tree and deals with balancing. |
I can get behind that.
There's value in both approaches. My concern with the nits is that I think most people won't get one. Perhaps some information in the README on the performance would nudge people in the right direction? |
I've been working on writing new tests for this problem that fix some of the issues discussed above, will provide a little more challenge, and will hopefully guide people a little more to some Ruby conventions for data structures. I am kind of stuck on whether to force the tree to add nodes in the order they are given or to allow people to reorder the tree as they see fit as long as they hold to the requirements that all value left are <= to the parent and that all values to the right are > to the parent. Didactically I think fixing the node order is best. It also has the benefit of making the tests easier. But the downside is that there is less flexibility in how to implement things. I would love other people's thoughts on that. This question also has some implications for what to consider tree identity. Do we care only that the tree contains the same items, or do we care in which subtrees they are in. I am strongly leaning towards the latter, because it seems reasonable that if The proposed (expanded) API: class BST
def self.new
# Returns an empty tree.
end
def size
# Returns the size of the tree.
end
def empty?
# Returns true if the tree is empty, false otherwise.
end
def item
# Returns the item at the root of the tree.
end
def left
# Returns the left sub-tree.
end
def right
# Returns the right sub-tree.
end
def include?(item)
# Returns true if the item is in the tree, false otherwise.
end
def insert(item)
# Inserts an item into the tree.
end
def delete(item)
# Deletes the item if it is in the tree, does nothing if it is not.
end
def each_in_order(&block)
# Yields the elements in sorted order
# Returns an enumerator to each_in_order if no block is given.
# Returns self if a block is given.
end
def ==(other)
# Returns false if other is not a BST.
# Returns true if item in root equals item in other's root and left == other.left and right == other.right,
# false othewise.
end
end |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hey I've been working on the BST exercise over at exercism/crystal#72 - and while researching other implementations I noticed a lot aren't fully implementing a binary search tree.
We are only implementing and testing insertion and traversal, but not searching and deletion. A binary search tree is a great data structure to learn and implementing all the operations provide additional challenge to the programmer. Maybe we could structure the tests in approximate descending difficulty - search, traverse, insert, delete.
I also think we should separate tests which are language specific (with some sort of annotation) from the actual BST functionality (i.e. returning an enumerator).
The text was updated successfully, but these errors were encountered: