Binary Search Trees in Ruby

There’s never a lack of options for ways to store data. Depending on the type of data you’re dealing with, you could do much worse than a binary search tree.

The concept behind binary search trees is simple.

  1. The tree is node based, and a new tree is initialized with a root node. This node has a value, a left space, and a right space. Both left and right spaces are empty to begin with.

  2. When you want to add a new node, the tree compares the value of the new node to the value of the root node.

  3. If the new node’s value is less than the root’s value, it’s stored in the root node’s left space. If there’s already a node there, then compare against that node and repeat.

  4. If the new node’s value is greater than the root’s value, it’s stored in the root node’s right space. If there’s already a node there, then compare against that node and repeat.

That’s it!

Things get a little more complex once the tree starts to fill out, and the algorithm has to continue traversing down each branch of the tree to find the perfect spot for your new node.

There’s also the question of inserting nodes with duplicate values. While it’s strongly recommended to exclude duplicate values from a binary search tree, you can elect to store them to the left (by allowing less than or equal to nodes) or right (by allowing greater than or equal to nodes). If you do allow this, you run the risk of creating an unbalanced tree with different numbers of branches on either side.

One way of implementing binary search trees in Ruby is with two separate classes: one for the tree itself and one for the node objects that populate it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
class BST

  attr_accessor :data

  def initialize(data)
    @root = Node.new(data)
    @sorted_data = []
  end

  def insert(val)
    @root.insert(val)
  end

  def traverse(node=@root)
    traverse(node.left) if node.left
    @sorted_data << node.data
    traverse(node.right) if node.right
    @sorted_data
  end

end

class Node

  attr_accessor :data, :left, :right

  def initialize(data)
    @data = data
    @left = nil
    @right = nil
  end

  def insert(val)
    new_node = Node.new(val)
    if new_node.data < data && left.nil?
      self.left = new_node
    elsif new_node.data < data
      left.insert(val)
    elsif right.nil?
      self.right = new_node
    else
      right.insert(val)
    end
  end

end

As you use the insert method on the tree to add new nodes, it filters down to the insert instance method on the root node, then recursively calls itself on each child node as it searches for an empty place to attach the new node to the tree.

The traverse method also uses recursion to visit each node and add their values to an array, starting down the left-most node at the bottom of the tree and working its way over to the right-most node.

Binary search trees perform relatively well, with average time complexity of O(log(n)) for all operations and O(n) at their worst.

However, since the shape of the tree is wholly dependent on the order that new nodes are added, the overall structure of the tree may degrade over time after numerous deletions. Also, since tree traversal requires comparisons against each node, very large trees may rack up longer search times.

Comments